Make and Makefiles

Summary in four points

Here (as a quick introduction, or later as a reminder) is the bare minimum you need to know (but you are, of course, welcome to read on):

a Makefile is just a plain text file (simply named "Makefile", with no extension), which is automatically read by the make command, and which simply contains a "to-do list" (whose items are known as "targets");
one line of the Makefile simply describes one target and what is needed to make it (known as "dependencies"), in the format:
```
 target: list of dependencies
```
for example (fictitious):
```
 cake: flour eggs butter sugar chocolate yeast
```
and that's it! Simple as that! Except that for us, targets are executables and dependencies are .o files; for example:
```
 calculCplx: calculCplx.o complex.o calculator.o
```
compilation dependencies (for the creation of a .o file, then) are simply the corresponding .c file, together with the list of required .h files; e.g.:
```
 calculator.o: calculator.c calculator.h complex.h
```
Note that all these target-dependency lines for compilation can be obtained simply by typing the command:
```
 gcc -MM *.c
```
Often, by convention, the first target is called "all" and designates all the executables you wish to build with this Makefile.

To sum up, here's a simple but complete example of a Makefile:

all: calculCplx

calculCplx: calculCplx.o complex.o calculator.o

# These lines were copied from the gcc -MM *.c command

complex.o: complex.c complex.h

calculator.o: calculator.c calculator.h complex.h

calculCplx.o: calculCplx.c calcGUI.h

And that's it! As simple as this!

Compiling programs

Note: this is a written tutorial. You might prefer the video lectures; choose whichever way of learning suits you best (or maybe benefit from both).

For the sake of modularization, the source code of a complete program written in C is often distributed over several text files called "source files". Source files are of two kinds: header files and main files (often called "definition files", or even simply "source files", hence some terminological confusion). By convention, header files have the .h extension, while definition files have the .c extension.

These are "glued together" by the compiler to create an executable program from the source code.

A pair (header file, definition file) corresponding to a given concept is called a "module".

What's the purpose of a header file, then?

A header file is there to announce to the other modules the functionality (API) provided by the module it is part of.

For example, a matrix.h file will contain the module's API for matrices.

In header files, we typically write:

#pragma once (see below);
directives to include the other header files necessary for this header file only (see below);
(very frequent) declarations of types offered by the module;
(very frequent) declarations of the functions offered by the module (corresponding to the "public" part in an OO design);
(frequent) some "macros" (lines beginning with the #define symbol);
(rare) declarations of (global) variables to be shared with other modules by the current module.

In the definition file (with extension .c), we typically write:

directives to include the header files necessary for this source file only (see below);
declarations of variables or functions used exclusively in the current module;
definitions of (variables and) shared functions (offered by the header file).
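To make the two lists above more concrete, here is a minimal single-file sketch of how a module's content is typically split. The names Complex, COMPLEX_ZERO, complex_add, complex_norm2 and square are hypothetical (the actual content of complex.h is not shown in this tutorial); the comments indicate which part would go in the header file and which in the definition file.

```c
/* ===== Part that would go in the header file (e.g. complex.h) ===== */
/* The real header would start with: #pragma once */

typedef struct {            /* a type offered by the module */
    double re;
    double im;
} Complex;

#define COMPLEX_ZERO ((Complex){0.0, 0.0})  /* a macro offered by the module */

Complex complex_add(Complex a, Complex b);  /* declarations (prototypes) */
double  complex_norm2(Complex z);           /* of the offered functions  */

/* ===== Part that would go in the definition file (e.g. complex.c) ===== */
/* The real file would start with: #include "complex.h" */

/* Helper used exclusively in this module: 'static' hides it from the others. */
static double square(double x)
{
    return x * x;
}

/* Definitions of the functions announced in the header. */
Complex complex_add(Complex a, Complex b)
{
    Complex r = { a.re + b.re, a.im + b.im };
    return r;
}

double complex_norm2(Complex z)
{
    return square(z.re) + square(z.im);
}
```

Another module would only need the first part (through an #include) to be able to call complex_add(); the second part is compiled once, into the module's own object file.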

Header files are not compiled directly into machine code, but their content is copied as a whole into all other modules that include them. These other modules (which need them) request a copy of a header file by indicating #include followed by the header file name. For example:

#include "matrix.h"

in a source file that requires matrices.

This copy is made by the compiler (more precisely, by its preprocessor) when compiling the module requesting the inclusion.

[ Note: the inclusion of "local" files (specific to our application) is written with double quotation marks (e.g. #include "matrix.h"), whereas the inclusion of standard libraries is written with "angle brackets" (e.g. #include <stdio.h>). ]

Compiling a program consists of two main stages:

the actual compilation stage:
- syntax is checked;
- variables and function calls are checked to ensure that all declarations exist;
- the corresponding machine code is created in "object" files (with the extension .o);
the "linking" stage:
- check that function calls correspond to their definition;
- and that only one definition exists for each function called;
- object files are linked together to create the final executable program.
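The difference between the two stages can be illustrated with a small sketch (the functions twice and quadruple are hypothetical names chosen for this illustration). The compilation stage only needs the declaration of a function to accept a call to it; it is the linking stage that looks for its definition.

```c
/* Declaration only: this is all the compilation stage (gcc -c) needs
   in order to check the calls below. */
int twice(int x);

int quadruple(int x)
{
    /* These calls are checked against the declaration above. */
    return twice(twice(x));
}

/* Definition: this is what the linking stage looks for.
   In a modular program, it would typically sit in another .c file. */
int twice(int x)
{
    return 2 * x;
}
```

If the definition of twice() were removed, compiling with gcc -c would still succeed, but the linking stage would typically fail with an "undefined reference" error.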

Let's take a look at two examples.

Example 1: a single file (as in your usual exercises)

The sum_odd.c file provided in done/ex_single is a (single) source file containing the code to request a positive number n and then calculate the sum of the first n odd numbers.

The program starts with a #include <stdio.h> directive which requests the inclusion (= copying) of standard definitions (std) for input-output (io), such as printf().

Try following the steps illustrated in the image below:

Illustration of dependencies

These steps are automatically performed (transparently) when you compile in an IDE. But, in order to understand them properly, let's do them step by step.

First, we'll create the object "files" (here, only one) using the following command:

gcc -c sum_odd.c -o sum_odd.o

The -c option tells the compiler not to perform linking, but only compilation (hence the c, as in "compile").

This option is followed by the name of the file from which you want to create the object file, then the name you want for the object file in question (the -o option means "output").

Run this command and check that the object file is actually present in the directory. Don't try to read or open it - it's machine code!

Next, you need to link the object files. And here, there are already several of them, unbeknownst to you: the one created from our source file and those of the standard libraries used, which are automatically linked by the compiler without our having to name them explicitly.

To make these links, we simply use the following command:

gcc -o sum_odd sum_odd.o

Once again, the -o option followed by the name of the desired file (in our example, the file is called sum_odd) is used to create the executable program with that name. Note that you can put this option and its associated file name wherever you like in the command (here we've put them first, whereas in the previous example, the compilation, we put them last).

Then we need to specify the files to be linked together to create the executable program. In our example, all we need to do is specify our only sum_odd.o (as standard libraries are linked automatically).

Check that the executable program has been successfully created and run it from the terminal by typing:

./sum_odd

Example 2: several files

A large program is usually broken down into several modules. In addition to bringing clarity to the program organization, this technique (known as "modular design") enables the reuse of elements (modules) for different programs (for example, one module for matrices, another for "ask for a number", etc.).

Let's take a look at how such programs are produced.

In the done/ex_multiples directory, you'll find five source files and four header files.

Look at the contents of all the files and try to reconstruct the dependencies illustrated below:

Illustration of dependencies 2

To create such a program, you must first compile all .c files into object files:

gcc -c array_filter.c
gcc -c array_sort.c
gcc -c array_std.c
gcc -c swap.c
gcc -c main.c

And then produce the executable (called selection_sort in our example):

gcc -o selection_sort array_filter.o array_sort.o array_std.o swap.o main.o

Create the executable as described above (tedious, isn't it? We'll come back to that in the next section), then run it. Its purpose is to sort, using the "selection sort" algorithm, an array of integers, whose size and range of values are given by the user.

Protection against multiple inclusions

What happens if, by mistake or indirectly, the same module header is included several times? For example, have you ever tried to write #include <stdio.h> twice in one of your programs?

If .h files are not protected against multiple inclusions, the compiler may refuse to compile, for example because of redefinition of a type already defined in the first inclusion.

It is therefore necessary to protect your .h files against multiple inclusions by starting them with the line:

#pragma once

This must be the very first line of your .h files.
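To see what such a protection prevents, here is a minimal sketch that simulates, within a single file, two inclusions of the same hypothetical header point.h. Since #pragma once only makes sense in a real header file, this sketch uses the traditional #ifndef guard instead, which has the same effect; this is an alternative technique shown for illustration, not the one used in this tutorial. The names POINT_H, Point and point_sum are assumptions.

```c
/* First "inclusion" of the hypothetical point.h:
   POINT_H is not defined yet, so the content is kept. */
#ifndef POINT_H
#define POINT_H
typedef struct { int x; int y; } Point;
#endif

/* Second "inclusion" of the same content:
   POINT_H is now defined, so everything up to #endif is skipped. */
#ifndef POINT_H
#define POINT_H
typedef struct { int x; int y; } Point;
#endif

/* Point has been defined exactly once and can be used normally. */
int point_sum(Point p)
{
    return p.x + p.y;
}
```

Without the #ifndef/#endif lines, the second typedef would typically be rejected by the compiler as a conflicting redefinition of Point.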

Automating compilation with `make`

Introduction

In the case of large (modular) programs, compiling and linking can become tedious (perhaps you've already found it to be the case for just 5 modules...): you have to compile each module ("separate compilation") in its own object file, then "link" all the object files produced.

And since it's highly likely that several modules will themselves call upon other modules, a modification to one of the modules may require recompiling not only the modified module, but also those that depend on it, recursively, and of course the final executable.

The make tool enables you to automate the sequence of commands that are dependent on each other. It can be used for many purposes, but its primary use (and the one we're interested in here) is the compilation of (executable) programs from source files. Benefits:

you don't have to do it by hand;
it recompiles only what is strictly necessary.

To use make, all you have to do is write a few simple rules describing the project's various dependencies in a simple text file named Makefile (or makefile).

Let's see how this tool is presented to us, in its manual:

man make

(Don't read everything! Just skim it to get an idea of what it is about.)

Makefile structure

A Makefile is essentially made up of rules, which define, for a given target,

all the dependencies of the target (i.e. the elements on which the target depends),
as well as the set of commands to be performed to update the target (from its dependencies).

It's a bit like a list of recipes:

"rule" = recipe;
"target" = result (e.g. chocolate cake);
"dependencies" = ingredients (e.g. flour, eggs, chocolate, sugar, butter);
"commands" = instructions for making the recipe.

But we're not cooking here. If we illustrate these concepts with the previous example (program selection_sort), we'd have, for example, one rule for linking (program selection_sort), another rule for compiling array_sort.c (into array_sort.o), and so on.

For the linking rule, we'd have:

target: selection_sort;
dependencies: array_filter.o, array_sort.o, array_std.o, swap.o and main.o (all these .o files must exist to produce the selection_sort executable);
command: the linking command used above.

For the array_sort.c compilation rule, we would have:

target: array_sort.o;
dependencies: array_sort.c, swap.h, array_filter.h (see previous figure, which shows the dependencies);
command: gcc -c array_sort.c.

Definition and operation of rules

The general syntax of a rule is:

target: dependencies
[tab]command 1
[tab]command 2

where:

target is most often the name of a file that will be generated by the commands (the executable program, object files, etc.), but it can also represent a "fileless" target, such as install or clean;
dependencies are the prerequisites for the target to be achievable, usually the files on which the target depends (e.g. declaration files like header files), but they can also be rules (e.g. name of the target of another rule);

to specify several dependencies, simply separate them with a space; a rule may also have no dependencies;

if a dependency occurs several times in the same rule, only the first occurrence is taken into account by make;
the commands are the actions that make must undertake to update the target; they are one or several shell commands;

we have one command per line, and group the commands related to a target below the dependency line;

a special syntax feature is that each command line must begin with a tab character ("TAB" key), and NOT with spaces; this is certainly the most archaic and annoying aspect of make!

It is possible to omit commands for a target; then either a default rule applies, or nothing at all (which might be useful simply for forcing dependencies/checks).

In fact, make has a number of implicit rules (typically for compilation), so we don't have to write too many things, as we'll see below.

Another piece of good news is that you can automatically generate the list of all dependencies using the -MM option in gcc:

gcc -MM *.c

Try it out! You should immediately recognize, in its output, the target-dependency lines described above. It's very handy to put them at the end of your Makefile.

Note that the order of the rules is not important, except when determining the default target (i.e. when the user types make on its own, without any arguments: the first rule is then launched; otherwise, simply type make target on the command line).

Exercises and examples

The simplest example of a Makefile is... an empty file!

Thanks to its implicit rules, make already knows how to do(=make) lots of things without you having to write anything.

Exercise 1

(in done/ex_single) Delete the files sum_odd.o and sum_odd and run make like this:

make sum_odd

All done. Great!

make "knows" that to make an X file from an X.c source file, you need to call the C compiler.

If you wanted to write a Makefile to do this, you could have written (try it!):

sum_odd: sum_odd.c

and that's it!

The target here is the sum_odd executable and its dependency, unique here, the sum_odd.c source file.

This Makefile does not specify any commands to be executed. It simply uses the default commands known to make.

If we wanted to make the command more explicit (but why?), a more complete Makefile would have been:

sum_odd: sum_odd.c
	gcc -o sum_odd sum_odd.c

where the command to go from the dependency to the target is made explicit (preceded by a TAB character).

Exercise 2

Let's try to write a completely artificial Makefile:

all: dep1 dep2
	@echo "target 'all' completed."

dep1:
	@echo "dependency 1 completed."

dep2:
	@echo "dependency 2 ok..."

dep3:
	echo "banzai!"

(You can either add these lines to the Makefile written for sum_odd if you tried the exercise above, or now create a Makefile file with the above lines).

If you simply type the command

make

you get:

dependency 1 completed.
dependency 2 ok...
target 'all' completed.

In this example, make is called on its own, with no indication of a particular target. make will thus search the Makefile for the first acceptable target, in this case all.
(There are particular targets that are not acceptable as default targets, but this is beyond the scope of this introduction.)

The rule for this target specifies two dependencies, dep1 and dep2, which don't exist (they don't correspond to any existing files); make will thus attempt to create them successively.

Since dep1 has no dependencies, make immediately proceeds to executing the commands accompanying the target, i.e. display a message on the terminal (using the echo command).

The same applies to the second dependency (dep2).

Once all dependencies have been realized, make returns to the initial target, all, whose build commands then get executed.
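The order described above (dependencies first, then the target itself) is that of a depth-first traversal. Here is a minimal C sketch of this idea; the Rule type and the build_order function are hypothetical names, and this is of course a strong simplification of what make really does (in particular, make skips targets that are already up to date and never builds the same target twice in one run).

```c
#include <stddef.h>
#include <string.h>

/* A rule: a target name and a NULL-terminated list of dependencies
   (at most 3 dependencies in this sketch). */
typedef struct Rule {
    const char *name;
    const struct Rule *deps[4];
} Rule;

/* Appends to 'out' (a string buffer of capacity 'cap') the names of the
   targets in the order in which their commands would be executed:
   first all the dependencies (recursively), then the target itself. */
void build_order(const Rule *r, char *out, size_t cap)
{
    for (size_t i = 0; r->deps[i] != NULL; ++i) {
        build_order(r->deps[i], out, cap);
    }
    strncat(out, r->name, cap - strlen(out) - 1);
    strncat(out, " ", cap - strlen(out) - 1);
}
```

Called on a rule all with dependencies dep1 and dep2 (themselves without dependencies), and an initially empty buffer, this function produces the order dep1, dep2, all, as observed in the output above.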

If we now type the command

make dep3

you get:

echo "banzai!"
banzai!

In this example, the target dep3 is specified as the goal when invoking make. This target has no dependencies; make thus directly executes the build commands for this target (displaying the string "banzai!").

Let's note a slight difference in behavior between our two examples: in the first case, the target is created by executing the commands directly, whereas in the second case, make first displays the command it will execute ("echo "banzai!"").

The reason for this behavior lies in the @ character preceding the command in the first case, and absent in the second. By default, make displays each command before actually running it. To suppress this automatic display, simply prefix the command with the @ character.

Tip: always let make display the commands it is supposed to do (especially compilations), except for pure display commands, such as echo.

Compiling with `Makefile`

That's all interesting, but what use is it "in real life", since we've seen that with the default implicit rules we don't need to write anything?
Sure! But in more complex projects, the default rules are no longer sufficient.

Let's say we have a program implementing a calculator for complex numbers, split into modules as follows:

in addition to the standard library, we have a graphics library, LibGraph, with its header file, libgraph.h, and a library file libgraph.so;
modeling of complex numbers and their arithmetic, with its header file complex.h and its implementation file complex.c;
calculator modeling (basic functions, memory, parentheses, etc.), with its header file calculator.h, which depends on complex.h, and source file calculator.c (no dependency);
modeling of the calculator's graphical interface, with calcGUI.h, dependent on calculator.h and libgraph.h, and calcGUI.c;
the main program (containing the main() function), provided as calculCplx.c file, which depends on calcGUI.h;
each source code (.c) also depends on its header file (.h).

Here's an illustration:

Illustration of previous dependencies

To write the corresponding Makefile, all we have to do is to add

a target for each module, i.e. one target for each object file resulting from compilation of the source file;
and another one to link the whole into an executable program.

The dependencies of each of these targets are all the files it depends on (!). But we only consider dependencies that can be modified as part of our project. We can therefore ignore dependencies on the graphics library, for example, just as we ignore dependencies on any other standard library.

These dependencies can be automatically generated using the command

    gcc -MM *.c

All we have to do is to copy its result into our Makefile.

The build commands are, of course, the compilation instructions; but we don't need to write them explicitly, as we have seen above: make has default commands which are perfectly fine in this case.

The only build command that needs to be specified is the "linking" command, which puts all the object files together to form the final executable. This is because the default linking rule will not make use of the required libgraph library.

A possible Makefile could therefore be:

 all: calculCplx

 calculCplx: calculCplx.o complex.o calculator.o calcGUI.o
	gcc -o calculCplx calculCplx.o complex.o calculator.o calcGUI.o -lgraph

 # These lines have been copied from gcc -MM *.c
 complex.o: complex.c complex.h
 calculator.o: calculator.c calculator.h complex.h
 calcGUI.o: calcGUI.c calcGUI.h calculator.h
 calculCplx.o: calculCplx.c calcGUI.h

With such a Makefile, our project can be compiled using the make command alone: the first target, all, is here simply an alias for the calculCplx target.

To build this target, make must first build the targets indicated as dependencies (the set of object files).

Note that make will only (re)construct a target if it does not exist yet or if at least one of its dependencies is more recent than the target itself. It is this mechanism that enables make to compile only what is strictly necessary. So, if you run the make command a second time, after the first compilation, it will report:

make: Nothing to be done for `all'.

which means there's nothing new to be done! Everything is up to date.
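The decision rule described above (rebuild only if a dependency is more recent than the target) can be sketched in C with the POSIX stat() function, which gives access to the last modification time of a file. This is only an illustrative simplification under stated assumptions, not make's actual implementation: the function name needs_rebuild is hypothetical, timestamps are compared with a one-second granularity, and a missing dependency is simply treated as a reason to rebuild.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Returns 1 if 'target' should be (re)built from the 'n' files listed
   in 'deps', and 0 if it can be considered up to date. */
int needs_rebuild(const char *target, const char *const deps[], size_t n)
{
    struct stat target_info;

    /* The target does not exist (or cannot be examined): build it. */
    if (stat(target, &target_info) != 0) {
        return 1;
    }

    for (size_t i = 0; i < n; ++i) {
        struct stat dep_info;

        /* Simplification: a missing dependency triggers a rebuild
           (make would first try to build that dependency). */
        if (stat(deps[i], &dep_info) != 0) {
            return 1;
        }
        /* The dependency was modified after the target: rebuild. */
        if (dep_info.st_mtime > target_info.st_mtime) {
            return 1;
        }
    }
    return 0;  /* nothing to be done */
}
```

For example, with the Makefile above, the target complex.o would be tested against the dependencies complex.c and complex.h.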

Similarly, if you were to modify only the complex.c file, the make command would only recompile that file (re-creating the target complex.o, since complex.c is one of its dependencies) and rerun the linker command, which in turn updates the target calculCplx (for the same reason).

If, on the other hand, the complex.h file is modified, the targets complex.o, calculator.o and calculCplx will be updated.

Finally, it should be noted that some libraries, particularly our own, must be specified when linking: this is the case, for example, of the graph library. This is done by adding the -lgraph option to the end of the linker command; hence the need to write this build command explicitly.

Exercise 3

In the done/ex_multiples directory, create a Makefile to compile the selection_sort program described above.

Test it.

There's a slight subtlety here: there's no selection_sort.c, but the main() function is in main.c. This is simply to make you write a rule for once (instead of using the default rule). Obviously, main.c would "normally" be called selection_sort.c. But you're not allowed to rename this file (or make a symbolic link ;-) ).

Conclusion and next steps

That's pretty much it for the basics. The rest of this document describes more advanced material, which is not strictly necessary for you, but can be useful if you want to go further than the bare minimum.

And if you'd prefer a more "classroom" video/presentation on the subject of separate compilation and Makefile, there are also a few lecture videos (52 min.).

If what has been presented here is enough for you (you've already spent enough time), you can simply continue this week's series where you left it.

Advanced elements (but so useful!)

What has been presented so far is sufficient to enable you to write a functional Makefile; however, as the previous example shows, writing a functional Makefile may be relatively tedious. The information in this section will enable you to considerably increase the expressive power of the Makefile instructions, making them easier to write.

Defining and using variables

To make writing Makefiles easier (and more concise), you can define and use variables (actually, they're more like macro-commands, but who cares?)

The general syntax for defining a variable in a Makefile is:

NAME = value(s)

(or its more advanced variants +=, :=, ::=, ?=)
where:

NAME: the name of the variable you wish to define; this name must not contain any of the characters :, # or =, nor accented letters; the use of characters other than letters, numbers or underscores is strongly discouraged;

variable names are case-sensitive;
value(s): a list of strings, separated by spaces.

Example:

RUBS = *.o *~ *.bak

Note also that for GNU make (also called gmake), the following syntax can be used to add one or more elements to the list of values already associated with a variable:

NAME += value(s)

To use a variable (i.e. to substitute it for the list of values associated with it), simply enclose the variable name in parentheses, preceded by the $ sign:

$(NAME)

Example:

-@$(RM) $(RUBS)

which, with the above definition of RUBS, deletes all *.o, *~ and *.bak files; the RM variable is one of the predefined variables in make (its default value is rm -f). The leading - tells make to carry on even if the command fails; remove the @ to see the command actually executed.

Note: These variables can be redefined when calling make; e.g.:

make LDLIBS=-lm my_target

redefines the LDLIBS variable.

Example of using variables

Suppose we want to systematically specify a certain number of options to the compiler; e.g. to enable the use of a debugger (-g), to force a level 2 optimization of the compiled code (-O2), and to make the compiler strictly comply with the C17 standard (-std=c17 -pedantic).

Rather than adding each of these options to every compile command (and having to re-modify everything when we want to change those options), it would be wiser to use a variable (for example CFLAGS, which is the default name used by make) to store the options to be passed on to the compiler. Our Makefile would then become:

 CFLAGS = -std=c17 -pedantic
 CFLAGS += -O2
 CFLAGS += -g 

 all: calculCplx

 calculCplx: calculCplx.o complex.o calculator.o calcGUI.o
	gcc -o calculCplx calculCplx.o complex.o calculator.o calcGUI.o -lgraph

 # These lines have been copied from gcc -MM *.c
 complex.o: complex.c complex.h
 calculator.o: calculator.c calculator.h complex.h
 calcGUI.o: calcGUI.c calcGUI.h calculator.h
 calculCplx.o: calculCplx.c calcGUI.h

Comments

It's possible to add comments in a Makefile (line-oriented, i.e. like the //... of C99 or Java), by marking the beginning of the comment with the # symbol. Note that comments in command lines are not removed by make before the commands are executed by the Shell. For example:

# Here's a comment line

all: dep1 dep2
	@echo "target 'all' completed."

dep1:
	@echo "dependency 1 completed."

dep2:
	@echo "dependency 2 ok..."

dep3: # this target is not built by default
	echo "banzai!" # comment submitted to Shell

Examples of execution:

$> make

dependency 1 completed.
dependency 2 ok...
target 'all' completed.

$> make dep3

echo "banzai!" # comment submitted to Shell

banzai!

Notice that the # comment submitted to Shell is indeed passed to the Shell, but since # is also the comment-character for the Shell, it is considered as a comment by the Shell.

"Automatic" variables

make automatically maintains a number of predefined variables, updating them as each rule gets executed, depending on the target and its dependencies.

These variables include:

$@ name of the target (file) of the current rule;
$< name of the first dependency of the current rule (in the implicit rules, this is the source file);
$? list of all dependencies (separated by a space) more recent than the current target (dependencies involving target updates);
$^ [GNU Make] list of all dependencies (separated by a space) on the target; if a dependency occurs several times in the same dependency list, it will only be reported once;
$(CC) compiler name (C);
$(CPPFLAGS) preprocessor options;
$(CFLAGS) compiler options;
$(LDFLAGS) linker options;
$(LDLIBS) libraries to be added.

For instance, the calculator's Makefile could be rewritten as follows (modification of the linker command):

 CFLAGS = -std=c17 -pedantic
 CFLAGS += -O2
 CFLAGS += -g 

 all: calculCplx

 calculCplx: calculCplx.o complex.o calculator.o calcGUI.o
	gcc -o $@ $^ -lgraph

 complex.o: complex.c complex.h
 calculator.o: calculator.c calculator.h complex.h 
 calcGUI.o: calcGUI.c calcGUI.h calculator.h 
 calculCplx.o: calculCplx.c calcGUI.h

Implicit rules

As mentioned above, make has a number of implicit rules (i.e. rules that the user doesn't need to specify), which enable it to "behave" in the presence of a source file without any further instructions. For instance, it "knows" how to produce object files from sources in assembly, Fortran, Pascal, Modula-2, Yacc, Lex, TeX, ..., and of course C and C++.

For example:

the target file.o will be automatically created from the file file.c by means of an (implicit) command of the form:
```
  $(CC) -c $(CPPFLAGS) $(CFLAGS) -o $@ $<
```
which can also be simplified to
```
  $(COMPILE.c) -o $@ $<
```
Usually, the CC variable is associated with the cc command.
a target file can be automatically created from the file.o object file, or from a set of object files (specified in the list of dependencies) of which file.o is a part, such as x.o file.o z.o, using a command of the form:
```
  $(CC) $(LDFLAGS) -o $@ $^ $(LOADLIBES) $(LDLIBS)
```
a target file can be automatically created from the file.c source file, and possibly a set of object files (specified in the list of dependencies), such as y.o z.o, using a command of the form:
```
  $(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LOADLIBES) $(LDLIBS)
```
which can be simplified to
```
  $(LINK.c) -o $@ $^ $(LOADLIBES) $(LDLIBS)
```

Therefore, we can transform our previous Makefile to make it even more concise, as follows:

 CFLAGS = -std=c17 -pedantic
 CFLAGS += -O2
 CFLAGS += -g

 all: calculCplx

 complex.o: complex.c complex.h
 calculator.o: calculator.c calculator.h complex.h
 calcGUI.o: calcGUI.c calcGUI.h calculator.h
 calculCplx.o: calculCplx.c calcGUI.h

 calculCplx: calculCplx.o complex.o calculator.o calcGUI.o
	$(LINK.c) -o $@ $^ -lgraph

or even:

 CFLAGS = -std=c17 -pedantic
 CFLAGS += -O2
 CFLAGS += -g 
 LDLIBS = -lgraph

 all: calculCplx

 complex.o: complex.c complex.h
 calculator.o: calculator.c calculator.h complex.h 
 calcGUI.o: calcGUI.c calcGUI.h calculator.h 
 calculCplx.o: calculCplx.c calcGUI.h

 calculCplx: calculCplx.o complex.o calculator.o calcGUI.o

where we have now completely removed the command associated with the last target (executable production).

Line breaks

When an element (variable definition, list of target dependencies, commands, ... and even a comment, although this is not recommended) is too long to reasonably fit on one line, it is possible to place a line break by telling make to consider the next line as a continuation of the previous one.

This is achieved by placing the \ character at the end of the line to be extended:

# here's a comment \
    on two lines

all: dep1 \
     dep2
	@echo "target 'all' done"

dep1:
	@echo "dependency 1 completed"

dep2:
	@echo "dependency 2 ok..." \
"indeed!"

Example of execution:

$> make

dependency 1 completed
dependency 2 ok... indeed!

target 'all' done

This example shows that clumsy use of this option can considerably impair the readability of the Makefile.

To find out more

Despite the name of the previous section, we have still only covered a small part of what make can do.

For those who would like to know even more, don't hesitate to consult the following references (all external):

[GNU make website](http://www.gnu.org/software/make/)
[The (GNU) make manual, taken from the previous site](http://www.gnu.org/software/make/manual/make.html)

Finally, please note that there are many more modern redesigns of development project management tools (CMake, SCons, GNU autotools, tools integrated into IDEs: KDevelop, Anjuta, NetBeans, Code::Blocks, ...), but we feel that a good knowledge of make is a real bonus on your programmer CV.