Building and cross-compile tutorial

Saturday, December 6, 2014

Disclaimer: This article is still a draft. You may find errors of various nature in the text below. In any case, if you follow the instructions below, you are doing it at your own risk! ;)

So, I’ve spent a lot of time lately trying to cross-compile a few projects of mine to a couple of different embedded platforms. It’s been a little painful, but in the end I succeeded, so I think it’s an experience worth sharing…

Also, I lately found that the compilation and linking processes are not fully understood by a large part of “youngsters”, so I’d like to start from the very beginning.

Building a program

Let’s start by addressing the problem of building a program. For small programs it is actually very simple, sometimes as simple as a single command line with a few parameters, but in medium/large projects even compiling for the host architecture can be tricky, so it is better to make this clear.

Ooops! I just realized that I am using some terms that could be new to you. So let me introduce some terminology.

‘host’ is the machine (or architecture, or platform) that you are using to compile the code;
‘target’ is the machine (or architecture, or platform) that is intended to run the code.

In the ‘usual’ building process, the ‘host’ and ‘target’ platforms are the same. For example, you use your PC to compile some code to make it run on the same PC. In cross-compilation, the ‘host’ and ‘target’ platforms differ. For example, you may want to use your PC to compile a program that is meant to run on your Raspberry-Pi.

Compiling the code

Ok, let’s start with the usual ‘Hello World!’ example. Here’s the source code:

#include <stdio.h>

int main(int argc, char* argv[])
{
    printf("Hello World!\n");
    return 0;
}

Let’s say that this simple code is saved to the helloworld.c file. You can compile with this very simple line:

gcc -c helloworld.c

This will produce helloworld.o, which is an object file. If you didn’t get any error from gcc, your code is syntactically correct, but it cannot be executed yet. You need to link the object file to the libraries that contain the runtime and possibly other code you may be invoking in your program.

Linking the code

The ld command in Linux invokes the linker, which is the tool you need. However, to the best of my knowledge very few people use ld explicitly. It is much easier to call gcc and have it call the actual linker, since this will hide much of the complexity from you. For example, to turn your helloworld.o object file into an executable binary file you should provide ld with the exact name and path of the standard C runtime for your platform. This can be done with a little effort, but it is surely much easier to write:

gcc helloworld.o

gcc will understand from the parameter you passed that it needs to invoke ld, and will pass all the parameters ld needs to link the object code to the C runtime. As a result, you will find a new file in your folder, named a.out. That is your executable program. Now calling:

./a.out

will produce:

Hello World!

Include directories

What happens when the code is slightly more complex than the ‘hello world’ example above? Well, you’ll likely need to add a few arguments to your gcc command. The first thing you will probably need is a few ‘include dirs’. Include directories are the paths to all the header files that are needed by your code due to the #include directives it contains. Please note that these directives are nested, and thus you may need to provide header files you never heard about just because they are included in some file YOU included. Include dirs are passed to gcc with the -I option, like this:

gcc -c -I./include helloworld.c

This command line will tell gcc to look for .h files in the include/ folder. Note that the path is relative, but you can obviously use absolute paths, and more than one path at a time:

gcc -c -I./include -I/usr/local/include helloworld.c

Note that the include paths do not need to actually exist. If they are not found, the compiler won’t complain (usually). On the other hand, if you omit an include directory that’s actually needed, you will get errors at compile time like this:

helloworld.c:2:22: fatal error: myheader.h: No such file or directory

This line says that the file helloworld.c tried to include myheader.h on line 2, but the compiler was not able to find that header file anywhere. It is important to note at this point that gcc has a list of notable locations it will check for header files in any case, but these locations usually contain only header files from system libraries or other libraries you (or your sys-admin) installed system-wide. Your local header files are likely stored in some other directories (local to your home folder), and thus you will need to add their paths as include dirs with the -I option to gcc.

Making a library

If you need to pack your code into a library, then you probably need the compiler only. I won’t go deep into details now, since I will probably be back on this topic in another post. However, things go differently depending on what kind of library you want to build: a static library, or a shared library.

Static

To build a static library, you need to compile the source code to obtain the object files, and then use the archiver ar to pack everything into a single .a file. Here’s an example:

gcc -c -I./include -I/usr/local/include -o my_library_obj_file.o my_library_source_file.c
ar rcs my_library.a my_library_obj_file.o

Shared

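Building a shared library is a little different from building a static one. This time you can do everything with gcc, without calling the archiver, but you will need to specify a few more parameters: -fPIC asks for position-independent code, and -shared tells gcc to produce a shared object instead of an executable: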

gcc -fPIC -shared -I./include -I/usr/local/include -o my_shared_library.so my_library_source_file.c

Cross-compilation

Cross-compilation is the process that allows you to compile code that is supposed to run on a ‘target’ architecture, while working on a different (‘host’) architecture. For example, you may want to compile a program for your Raspberry-Pi (the target architecture) on your laptop (the host architecture).

What you need

Basically, to cross-compile a program or library you need two things:

a tool-chain running on your host, targeting your target architecture;
the file system of your target machine (“sysroot” in the following).

The tool-chain can be obtained in many different ways. The easiest is undoubtedly to find a .deb or .rpm package to install the tool-chain on your host system. For example, this is possible when the target architecture is the Raspberry-Pi and the host is your PC (see https://github.com/raspberrypi/tools for details). If a binary package is not available, you may need to compile a custom tool-chain from scratch! In this case, tools like crosstool-ng may help (http://crosstool-ng.org/#introduction).

The sysroot is a mere copy of the file system of your target platform. Actually, you do not need to copy the entire file system to your host: the folders /usr and /lib would suffice.

It is a good idea to keep all these things gathered in a single place. I suggest you create a folder (e.g. x-compile) and store the tool-chain and the sysroot in there. Be tidy, because things can easily become a painful mess!
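As a sketch, the layout used in the rest of this article could be created like this. The sysroot, depsrc, deps and src names are the ones used below; the toolchain folder and the XROOT variable are just my assumptions for the example.

```shell
# Root of the cross-compilation workspace; set XROOT beforehand to put it elsewhere.
XROOT="${XROOT:-$HOME/x-compile}"

# toolchain: the cross tool-chain     sysroot: copy of the target's /usr and /lib
# depsrc:    sources of dependencies  deps:    cross-compiled dependencies
# src:       the code you want to cross-compile
mkdir -p "$XROOT/toolchain" "$XROOT/sysroot" "$XROOT/depsrc" "$XROOT/deps" "$XROOT/src"

# Copying /usr and /lib from the target could look like this (hypothetical
# user and host name; needs rsync and SSH access to the target):
#   rsync -a pi@raspberrypi:/lib "$XROOT/sysroot/"
#   rsync -a pi@raspberrypi:/usr "$XROOT/sysroot/"

ls "$XROOT"
```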

Satisfy the dependencies

When you start porting code to a specific target platform, it is likely that the first problem you will face is satisfying a few (many?) missing dependencies. This problem is easy to solve in principle, but can easily mess things up to a level you wouldn’t imagine.

If the code depends on some library that is NOT in the sysroot, there’s no way out but to find it somewhere, somehow. Dependencies can be satisfied in two ways: with static libraries or with shared libraries. If you are lucky, you could find a binary package providing what you need (i.e. the library files AND the header files), but most often you will have to cross-compile the source code on your own. Either way, you end up with one or more binary files and a bunch of header files. Where to put them? Well, that depends. There are a few different situations that can happen, but basically everything reduces to two cases:

In the sysroot. If you are satisfying the dependencies with shared libraries (.so files) this is probably the most common solution (and maybe the best one). Remember that once everything is up and running, these libraries must be installed somewhere in the file system of the target platform. So there is a natural answer to the question above: install them in the target sysroot, for example in /usr/lib (the shared library binaries) and /usr/include (the header files), or in any other path that allows the loader to find those libraries when the program executes. AND install them in the file system of the actual target machine, in the same places, in order to make everything work as expected. Please note that static libraries (‘.a’ files) do not need to be installed in the target file system, since their code is embedded in the executable file when you cross-compile a program.
In a different folder. This could be an interesting solution to keep the libraries that you cross-compiled on your own separated from the other libraries (for example, the system libraries). You can do that if you want (I often do that!) but if you do, you must remember to provide the compiler and the linker with the paths where header files and binary files can be found. With static libraries, this information is only needed at compile and link time, but if you are using shared libraries, this won’t suffice. You must also specify where these libraries can be found at run time.

How to do that

Ok, enough talking. Now let’s see HOW to actually cross-compile. I will assume that:

You have your tool-chain installed, that it is the correct tool-chain and the PATH environment variable is correctly set, so that the cross-compiler and all other cross-tools binaries can be called from any folder.
You have the sysroot installed in ~/x-compile/sysroot
Your code depends on a library for which you have the source code in ~/x-compile/depsrc/
You have the source code to be cross-compiled in ~/x-compile/src

Given that all of the above applies to you, cross-compilation requires the following steps.

Cross-compile the dependencies

As said, when you cannot find a binary package for a given library your code depends upon, you have to cross-compile a version of it for your target platform. This can only be done, obviously, if the source code is available for that library, for example if it is open source. In my world, this is often the case. I hope so for yours… ;-)

Many open source libraries use auto-tools to compile, which means that for these libraries the compilation requires the following commands (DON’T DO THIS YET):

./configure
make
sudo make install

Since what we are trying to do is cross-compile the library, we will need something different from the usual commands above. Here’s an example:

./configure CC=arm-linux-gnueabihf-gcc --prefix=$HOME/x-compile/deps --host=arm-linux-gnueabihf
make
make install

The meaning of these commands is the following (proceeding in order, from top to bottom):

we call the configure script passing a few parameters. The first tells configure to use the cross-compiler instead of the usual gcc; the second sets the folder where the compilation products will be installed (it is written with $HOME because bash does not expand ~ in the middle of an argument such as --prefix=~/x-compile/deps); the third sets the architecture of the machine that will be running the binaries. Beware of the naming here: what configure calls --host is what this article calls the ‘target’.
call make, which is a GNU build tool (I would rather say THE build tool) that uses so-called makefiles to build a project. This actually performs the compilation and linking steps.
call make with the install target, which means we are asking make to install the binaries to the folder we previously set with the --prefix option. No sudo is needed this time, since the prefix is inside your home folder.
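Before running configure for real, it can be handy to check that the cross-compiler is actually reachable and to look at the command you are about to run. The sketch below only prints things; the cross_configure_cmd helper and the TARGET and PREFIX variables are hypothetical names I made up for the example.

```shell
# Target triplet and install prefix; both values come from the example above.
TARGET=arm-linux-gnueabihf
PREFIX="$HOME/x-compile/deps"

# Print the configure command line for a given triplet ($1) and prefix ($2).
cross_configure_cmd() {
    printf './configure CC=%s-gcc --prefix=%s --host=%s\n' "$1" "$2" "$1"
}

# Check that the cross-compiler is reachable through PATH before configuring.
if command -v "${TARGET}-gcc" >/dev/null 2>&1; then
    echo "found ${TARGET}-gcc"
else
    echo "${TARGET}-gcc not found in PATH: fix the tool-chain setup first"
fi

# Show the command that would be run from the library's source folder.
cross_configure_cmd "$TARGET" "$PREFIX"
```

Once the output looks right, you can run the printed command from the folder of the library you are cross-compiling, followed by make and make install.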

Cross-compile your code

To cross-compile your code you obviously need to invoke the cross-compiler coming with the tool-chain you installed. I will refer to the case where the Raspberry-Pi is the target architecture, both because it is a quite common case and because it is the latest experiment I tried :).

The tool-chain compiler is usually a particular version of gcc. Typically, the binary name is prefixed with a string identifying the target architecture. For the Raspberry-Pi architecture, a common tool-chain provides arm-linux-gnueabihf-gcc. For very simple programs, cross-compiling turns out to be as simple as using this cross-compiler instead of the usual gcc:

arm-linux-gnueabihf-gcc -o hello_world hello_world.c

but things get more complex when the code is not trivial. In the case I described in the previous section, the command line would be something like this:

arm-linux-gnueabihf-gcc --sysroot=$HOME/x-compile/sysroot -I./include -I=/usr/local/include -L$HOME/x-compile/deps -o hello_world hello_world.c ~/x-compile/deps/my_static_library.a -lmy_shared_library

Quite complex, isn’t it? We have many more parameters and options in this command line, so let’s take a closer look.

--sysroot=$HOME/x-compile/sysroot is a very important option, since it tells the cross-compiler to look for the system headers and libraries (the default /usr/include, /usr/lib and so on) under the given path instead of on the host. Paths that you pass explicitly with -I and -L are used as they are, unless they start with =, in which case the = is replaced by the sysroot. So -I./include is still relative to the current folder, while -I=/usr/local/include points to ~/x-compile/sysroot/usr/local/include. The path is written with $HOME because bash does not expand ~ in the middle of an argument.
-L$HOME/x-compile/deps adds the path ~/x-compile/deps to the list of paths where the libraries named with -l (static or shared) are searched at link time. I am supposing that there exist two libraries, my_static_library.a and libmy_shared_library.so, within the ~/x-compile/deps folder.
-lmy_shared_library tells the linker we are linking against libmy_shared_library.so (remember what I said above about the -L option…).
~/x-compile/deps/my_static_library.a simply tells the linker to include the code from this library. The complete path is needed here, because the -L search only applies to libraries named with the -l option.

Note that both libraries come after hello_world.c: the linker resolves symbols from left to right, so a library should be listed after the files that use it.

That should build a binary executable file for your target architecture (which is formally armv6l for the Raspberry-Pi). You can verify that by using the command file on the result:

file hello_world

You should see a line of text containing the word ARM somewhere (file reports the architecture family, not the exact armv6l name). If it is missing, then something went wrong and what you get is not an executable for the Raspberry-Pi.
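If the file command is not available on your machine, the same information can be read straight from the ELF header: the target architecture is stored in the 16-bit e_machine field at byte offset 18, and its value is 40 for ARM. The two helper functions below are hypothetical names of mine, and the sketch assumes a little-endian binary inspected on a little-endian host.

```shell
# Print the e_machine field of an ELF file as a decimal number.
# The field is a 16-bit value at byte offset 18 of the ELF header.
# od decodes the two bytes in the host's byte order, hence the
# little-endian assumption stated above.
elf_machine() {
    od -An -tu2 -j18 -N2 "$1" | tr -d ' \n'
}

# Map a few well-known e_machine values to readable names.
elf_machine_name() {
    case "$(elf_machine "$1")" in
        3)   echo "x86" ;;
        40)  echo "ARM" ;;
        62)  echo "x86-64" ;;
        183) echo "AArch64" ;;
        *)   echo "unknown" ;;
    esac
}

# Usage: check the freshly built binary, if it exists in the current folder.
if [ -f hello_world ]; then
    elf_machine_name hello_world
fi
```

A binary built with the host gcc by mistake would typically show up here as x86 or x86-64 instead of ARM.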

Install your code to the target architecture and make it run like a champ!

At this point, you have probably already copied the binary file to the Raspberry (or your target machine) and seen that it does not work… :) Keep calm, we are almost done. If the program fails by saying it was unable to load (or find) a .so library, it is because we didn’t tell the loader where that library can be found. And if everything was done correctly, the error should refer to our dependency, libmy_shared_library.so. If so, there are a few ways you can fix things:

copy libmy_shared_library.so to a place that the system looks into for other libraries, for example /usr/lib or /usr/local/lib
copy libmy_shared_library.so wherever you like and start the program like this:

LD_LIBRARY_PATH=/path/to/the/folder/containing/the/library ./hello_world
modify the value of the LD_LIBRARY_PATH environment variable before calling the program:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/the/folder/containing/the/library
./hello_world

All of this should work. Symbolic links are also ok, so if you prefer you may just create a symlink in /usr/lib pointing to libmy_shared_library.so, wherever it is placed. But the solution I prefer is a little different: I like to set an rpath into the binary file of my program.

An rpath is a path that is stored within the binary file itself, and that the loader searches for libraries in addition to the default system directories. To set one, you have to add a few more options to your gcc command line, like this:

arm-linux-gnueabihf-gcc --sysroot=$HOME/x-compile/sysroot -I./include -I=/usr/local/include -L$HOME/x-compile/deps -o hello_world -Xlinker -rpath=./ hello_world.c ~/x-compile/deps/my_static_library.a -lmy_shared_library

The -Xlinker -rpath=./ option tells the linker to add ./ as an rpath when it creates the binary file. In this way, you can simply put your dependencies in the same folder as the executable binary file and start the program from there (a relative rpath like ./ is resolved against the current working directory). I think it is a very practical solution to distribute an application with its own dependencies without having to install the libraries system-wide.