Using Compiled Languages#
First steps with python using numba#
import numpy as np
import numba
Python is a nice object-oriented scripting language, but it can run into performance issues. We will see a few examples below. To get better performance, one uses compiled languages, such as C, C++ and Fortran. We will also use the numba python library, which performs “just in time” compilation. In this lecture, however, we will explore in more detail how to compile C and Fortran codes directly. We will see in a later lecture how to interface these compiled functions directly with python.
Let’s start with a simple example to see how badly python performs when not used properly. We define a simple function that uses a python loop, which is generally a very bad idea for numerical work in python.
def f_simple(X):
    Y = np.empty_like(X)
    for i in range(len(X)):
        x = X[i]
        Y[i] = x + x**2 + x**3 + x**4 + x**5 + x**6 + x**7 + x**8
    return Y
We create a random numpy array of moderate size.
x = np.random.normal(size=1_000_000)
We finally call the function and time it using the %%timeit cell magic provided by IPython.
%%timeit
f_simple(x)
1.07 s ± 6.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
We see that it called the timing routine 7 times, each timing using only 1 call to the function. We can change this to see how robust the time measurements are. The standard deviation seems indeed a bit large.
%%timeit -r 4 -n 4
f_simple(x)
1.07 s ± 2.26 ms per loop (mean ± std. dev. of 4 runs, 4 loops each)
We see that the measurement seems now more consistent with a smaller standard deviation.
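Outside a notebook, a similar measurement can be made with the standard timeit module. The sketch below is only an approximation of what `%%timeit -r 4 -n 4` reports, not a description of how the magic is implemented: `repeat` plays the role of `-r`, `number` the role of `-n`, and the pure-python function `poly_loop`, the small input list and the use of a population standard deviation are assumptions made for this illustration.

```python
import statistics
import timeit


def poly_loop(values):
    """Evaluate x + x**2 + ... + x**8 with an explicit Python loop."""
    result = []
    for x in values:
        result.append(x + x**2 + x**3 + x**4 + x**5 + x**6 + x**7 + x**8)
    return result


values = [i / 10_000 for i in range(10_000)]  # small stand-in input

runs, loops = 4, 4  # same roles as -r 4 -n 4
totals = timeit.repeat(lambda: poly_loop(values), repeat=runs, number=loops)

# Each entry of totals is the time for `loops` calls; divide to get time per call.
per_loop = [t / loops for t in totals]
mean = statistics.mean(per_loop)
spread = statistics.pstdev(per_loop)
print(f"{mean * 1e3:.2f} ms ± {spread * 1e3:.2f} ms per loop "
      f"(mean ± std. dev. of {runs} runs, {loops} loops each)")
```

Increasing `loops` averages out short-lived noise within one run, while increasing `runs` gives a better estimate of the run-to-run scatter.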
Let’s now write more idiomatic python, replacing the explicit loop with direct numpy array operations.
def f_numpy(X):
    return X + X**2 + X**3 + X**4 + X**5 + X**6 + X**7 + X**8
%%timeit -r 4 -n 4
f_numpy(x)
107 ms ± 384 μs per loop (mean ± std. dev. of 4 runs, 4 loops each)
Wow! It is indeed much faster. Too fast even… Let’s use a bigger array. For this, on your own jupyter notebook, please uncomment the second line.
x = np.random.normal(size=1_000_000)
# x = np.random.normal(size=10_000_000)
%%timeit -r 4 -n 4
f_numpy(x)
108 ms ± 285 μs per loop (mean ± std. dev. of 4 runs, 4 loops each)
These multiple powers are probably slow to evaluate. Let’s use a nice trick to avoid having to call these expensive operations.
def f_numpy_2(X):
    return X * (1 + X * (1 + X * (1 + X * (1 + X * (1 + X * (1 + X * (1 + X)))))))
%%timeit -r 4 -n 4
f_numpy_2(x)
7.48 ms ± 328 μs per loop (mean ± std. dev. of 4 runs, 4 loops each)
Wow! Another dramatic improvement!
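The nested form used in f_numpy_2 is commonly known as Horner's scheme. As a quick sanity check, here is a small sketch on plain python floats (the function names `poly_powers` and `poly_horner` are chosen for this illustration only) showing that the two expressions agree up to rounding errors:

```python
import math


def poly_powers(x):
    """Direct evaluation: seven power operations and seven additions."""
    return x + x**2 + x**3 + x**4 + x**5 + x**6 + x**7 + x**8


def poly_horner(x):
    """Nested (Horner) evaluation: only multiplications and additions."""
    return x * (1 + x * (1 + x * (1 + x * (1 + x * (1 + x * (1 + x * (1 + x)))))))


for x in (-1.5, -0.3, 0.0, 0.7, 2.0):
    # The two forms differ only by floating point rounding.
    assert math.isclose(poly_powers(x), poly_horner(x), rel_tol=1e-12, abs_tol=1e-12)
```

The nested form replaces every power by a single multiplication, which is why it is so much cheaper to evaluate.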
We have more or less reached the maximum we can do using python and numpy alone. We will now try a nice python package called numba that performs just in time compilation. What numba does is to first translate the python function into an intermediate representation for the LLVM compiler library and then to compile it to machine code on the fly. The performance of the resulting function is usually much higher. Since the function is now compiled, you don’t need to worry about using loops directly anymore. In fact, to help numba translate the python instructions into compiled instructions, it is recommended to use explicit loops.
Let’s see how we can optimize our function using numba.
@numba.jit(nopython=True)
def f_numba(X):
    Y = np.empty_like(X)
    for i in range(len(X)):
        x = X[i]
        Y[i] = x + x**2 + x**3 + x**4 + x**5 + x**6 + x**7 + x**8
        # Y[i] = x*(1 + x*(1 + x*(1 + x*(1 + x*(1 + x*(1 + x*(1 + x)))))))
    return Y
Note that we have used the decorator @numba.jit that tells numba to translate the function and compile it to machine code. Depending on the numba version, a plain @numba.jit may fall back to running parts of the python code as is when it cannot translate them. Using the option nopython=True forces numba to compile the whole function. If numba fails to do it, an error will follow.
Let’s now time the resulting compiled function.
%%timeit -r 4 -n 4
f_numba(x)
The slowest run took 275.86 times longer than the fastest. This could mean that an intermediate result is being cached.
25.7 ms ± 43.8 ms per loop (mean ± std. dev. of 4 runs, 4 loops each)
This is now really fast! This is the main advantage of using a compiled language. The standard deviation is quite large when compared to the mean. This is because the timer is also counting the extra time numba needs to compile the function during the first call. To reduce this effect, we can use an even bigger array (again please uncomment the second line in the next cell). Note that we could also have used the cache=True option of numba but this is beyond the scope of this lecture.
x = np.random.normal(size=1_000_000)
# x = np.random.normal(size=100_000_000)
%%timeit -r 4 -n 4
f_numba(x)
462 μs ± 133 μs per loop (mean ± std. dev. of 4 runs, 4 loops each)
Using a compiler also allows us to use parallel computing. We will see in future lectures how to program in parallel. For the time being, we just trust numba to do it for us. To parallelize a numba function, just add the parallel=True option and replace the range function defining the loop by the parallel function numba.prange which defines the method to divide up the loop into parallel tasks.
@numba.jit(nopython=True, parallel=True)
def f_numba_para(X):
    Y = np.empty_like(X)
    for i in numba.prange(len(X)):
        x = X[i]
        Y[i] = x + x**2 + x**3 + x**4 + x**5 + x**6 + x**7 + x**8
    return Y
%%timeit -r 4 -n 40
f_numba_para(x)
The slowest run took 57.11 times longer than the fastest. This could mean that an intermediate result is being cached.
2.68 ms ± 4.31 ms per loop (mean ± std. dev. of 4 runs, 40 loops each)
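This ends our journey towards better and better performance using python. We started with an explicit loop within a pure python function and ended with compiled, parallel machine code generated by numba. For the 1,000,000-element array used in the cells above, the timings went from roughly 1 s down to a few milliseconds or less, a speedup of several hundred to a few thousand. The figures originally quoted for this exercise, presumably obtained with the larger arrays of the commented lines, are roughly 200 s for the pure python loop against roughly 20 ms for the numba version, an amazing 10000x (yes, ten thousand!) speedup.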
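As a rough check of the speedups, the short sketch below simply divides the timings printed in the cells above for the 1,000,000-element array. The numbers are copied from those outputs; the dictionary keys are labels chosen here, and the parallel timing still includes compilation overhead, so its ratio should be read with caution.

```python
# Timings in seconds, copied from the %%timeit outputs shown above.
timings_s = {
    "f_simple": 1.07,         # explicit Python loop
    "f_numpy": 107e-3,        # numpy array powers
    "f_numpy_2": 7.48e-3,     # nested products
    "f_numba": 462e-6,        # numba, second timing run
    "f_numba_para": 2.68e-3,  # numba parallel, includes compilation overhead
}

baseline = timings_s["f_simple"]
speedups = {name: baseline / t for name, t in timings_s.items()}

for name, factor in speedups.items():
    print(f"{name:14s} {factor:8.1f}x")
```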
Linux and the Terminal window#
To understand better what numba is doing under the hood, we now switch to using the Terminal window. We will edit our C or Fortran codes using an editor (could be vim or emacs). We will then compile our C or Fortran code using a compiler. Before we get there, let’s first get some practice in the Terminal window.
In your jupyter notebook Home page, hit the New button, this time choosing the Terminal option.
You should see a prompt like $
and a cursor. Just type:
$ ls
. You should see the content of the course directory in the Terminal window.
In this jupyter notebook, you can execute all the same command line instructions by putting a ! in front of them.
!ls
Compiled_Languages.ipynb ddt_example.png profiling_stone_teyssier_slides.pdf
cmake_example debugging.md rust_example
complexLook.jpg profiling.md
Here is a list of very useful Linux commands that you have to know by heart.
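Command | Examples | Description |
---|---|---|
ls | ls | List files in current directory |
cd | cd .. | Change to parent directory |
mkdir | mkdir <dirname> | Create a new directory called <dirname> |
rmdir | rmdir <dirname> | Remove the directory called <dirname> |
cp | cp <file1> <file2> | Copy <file1> into <file2> |
rm | rm <file> | Remove only the file called <file> |
mv | mv <file> <newlocation> | Move one file into another location and with a new name |
more | more <file> | Look at the file content one page at a time |
man | man <command> | Look at the manual for a given Linux command |
grep | grep <string> <file> | Search for string <string> in <file> |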
Try now to play with these different commands in the Terminal window. In the remainder of the lecture, we will have to use the Terminal window again so get used to it!
Compiling a C code#
Now let’s move to the core of the lecture, namely learning how to compile actual code. We will start simple with the famous Hello world example. The cell below will write a C file called hello.c.
%%writefile hello.c
#include <stdio.h>
int main() {
    // This is a comment
    printf("Hello, World!\n");
    return 0;
}
Writing hello.c
You can check in the Terminal window that this file was properly created using the $ more hello.c command.
The first line is an include statement. It tells the compiler to include at the beginning of the file another file called stdio.h which is part of the standard C library headers. As the name indicates, this file declares the standard Input/Output C functions. The function we use here is printf, which outputs the character string Hello, World! to the screen. Comments are defined in C using the // marker.
In this lecture, we will not teach the basics of the C language. We only focus on the compilers. If you need more details on the C syntax, please use the web as a never ending source of information.
To compile the code, we now need to use a compiler. On most Linux systems, you will usually find by default the GNU compiler called gcc. More resources on the GNU C compiler can be found here. The command to compile our simple hello.c code is as follows:
!gcc hello.c
This command creates a new file called a.out
which is the executable of your code. You can check that it is indeed here by typing:
!ls
Compiled_Languages.ipynb ddt_example.png profiling_stone_teyssier_slides.pdf
a.out debugging.md rust_example
cmake_example hello.c
complexLook.jpg profiling.md
You can now run this executable by typing:
!./a.out
Hello, World!
Congratulations! You succeeded in running your first compiled code!
In the previous cell, the symbol ! is used to execute a Linux command from the jupyter notebook. In the Terminal window, you can try and execute $ ./a.out where $ is the prompt (don’t type the $ symbol, it should be already there!). The dot-slash ./ means execute the code that sits here, in this directory.
Note that if you type only $ a.out it won’t work.
!a.out
/usr/bin/sh: 1: a.out: not found
Indeed, the operating system wasn’t able to find the executable in any of the directories it searches.
For that, you need to add the directory containing your executable to the PATH variable, which lists the directories searched for executables.
Try now to type in the Terminal window:
export PATH=~/se-for-sci/content/week09:$PATH
echo $PATH
You should see a long list of directories containing all the executables accessible to you, including:
<yourhomedir>/se-for-sci/content/week09
You can now type a.out in the Terminal window and it will work like a charm.
Note that this jupyter notebook inherits the PATH the system had when you launched it with the command jupyter notebook. A command such as !export PATH=... will not change it, because each ! command runs in its own temporary shell. This is why you need to use the Terminal window for this little exercise we just did.
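To make the lookup rule concrete, here is a small sketch using python's standard shutil.which, which searches a PATH-style list of directories in the same spirit as the shell. The temporary directory, the directory name `/nonexistent-dir` and the file name `my_tool` are hypothetical and only serve the illustration; the sketch assumes a Linux-like system.

```python
import os
import shutil
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Create a tiny executable script standing in for a compiled program.
    tool = os.path.join(workdir, "my_tool")
    with open(tool, "w") as handle:
        handle.write("#!/bin/sh\necho hello\n")
    os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)

    # Searching only a directory that does not contain the file finds nothing.
    not_found = shutil.which("my_tool", path="/nonexistent-dir")

    # After prepending the directory, as `export PATH=dir:$PATH` would do,
    # the executable is found.
    search_path = workdir + os.pathsep + "/nonexistent-dir"
    found = shutil.which("my_tool", path=search_path)

print(not_found)          # None
print(found is not None)  # True
```

Typing ./a.out bypasses this search altogether because the command then contains an explicit directory.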
As you have probably guessed, a.out is the default name for executables. If you want to give it a proper name, use the -o option.
!gcc hello.c -o hello
!ls
Compiled_Languages.ipynb ddt_example.png profiling.md
a.out debugging.md profiling_stone_teyssier_slides.pdf
cmake_example hello rust_example
complexLook.jpg hello.c
We have now a new file called hello
which is our new executable.
!./hello
Hello, World!
Let’s now move to a more complicated task. We would like to reproduce the exercise we did using python and numba
but this time directly ourselves using C.
Here is the C code that implements the power function we used before.
To make a proper comparison with our previous attempts, please uncomment the line in the following cell.
%%writefile power.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main()
{
    int i,n=1000000;
    // int i,n=100000000;
    float x,y;
    printf("%i\n",n);
    for (i=0;i<n;i++){
        x=rand();
        y=x+pow(x,2)+pow(x,3)+pow(x,4)+pow(x,5)+pow(x,6)+pow(x,7)+pow(x,8);
    }
    return 0;
}
Writing power.c
We will not dwell on the new C syntax introduced here: declaring integer and floating point variables, a for loop and the external functions rand() and pow().
The key point is that we now need to add more include statements to allow the use of these external functions. The header stdlib.h declares functions, such as rand(), that belong to the standard C library, which is linked by default. The header math.h declares pow(), but the compiled function lives in a separate math library. We need to tell the compiler to look into this external library to find the pow() function.
This is done using the -l option that tells the compiler to link your code with an external library of already compiled functions. In our case, the name of the math library is simply m, so we have to type the command:
!gcc power.c -o power -lm
Note that the outcome of this compilation might depend on your system. Some GNU C compiler versions need the -lm linking option, some others don’t.
To see what version of the compiler you use, just type:
!gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Let’s now try to execute our new code using the Linux command time:
!time ./power
1000000
0.10user 0.00system 0:00.11elapsed 99%CPU (0avgtext+0avgdata 2048maxresident)k
0inputs+0outputs (0major+86minor)pagefaults 0swaps
This is very disappointing! The reason for this poor performance is the function pow() that works for any floating point power, not just the integer powers we need here. Let’s use the same trick we used above for our python code and re-write our C code as follows:
%%writefile mult.c
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int i,n=1000000;
    // int i,n=100000000;
    float x,y;
    printf("%i\n",n);
    for (i=0;i<n;i++){
        x=rand();
        y=x*(1+x*(1+x*(1+x*(1+x*(1+x*(1+x*(1+x)))))));
    }
    return 0;
}
Writing mult.c
Note that we don’t use the math library anymore. We can compile now using:
!gcc mult.c -o mult
!time ./mult
1000000
0.01user 0.00system 0:00.01elapsed 100%CPU (0avgtext+0avgdata 1408maxresident)k
0inputs+0outputs (0major+72minor)pagefaults 0swaps
Much better! We can also compile the code using the optimization option -O (capital O) that lets us choose among various degrees of optimization, from -O0 which corresponds to basically zero optimization to -O3 which allows the compiler to aggressively re-write parts of your code to make it faster.
Let’s try to optimize our executable using:
!gcc -O3 mult.c -o mult
!time ./mult
1000000
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 1408maxresident)k
0inputs+0outputs (0major+71minor)pagefaults 0swaps
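Indeed, much better! We have now reached the same level of performance as numba, but we did it ourselves. Keep in mind, however, that the variable y is never used, so an optimizing compiler is free to remove the computation altogether; such a tiny timing should therefore be interpreted with care.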
Compiling a Fortran code#
Let’s continue our exploration of compilers and use now a very popular language in scientific computing, namely Fortran.
We start with the Hello world code. The syntax is very different from that of C.
%%writefile hello.f90
program hello
write(*,*)"Hello world!"
end program hello
Writing hello.f90
The most widely distributed Fortran compiler on Linux machines is here also the GNU Fortran compiler gfortran, part of the GNU Compiler Collection (GCC). You can find more details here.
The compilation options are very similar to those of the C compiler.
!gfortran hello.f90 -o hello
!./hello
Hello world!
Let’s now explore the capabilities of the Fortran compiler for our computational example. We have the following code in Fortran:
%%writefile power.f90
program power
real(kind=8)::x, y
integer::i,n=1000000
! integer::i,n=100000000
write(*,*)n
do i=1,n
x=rand()
y=x*(1+x*(1+x*(1+x*(1+x*(1+x*(1+x*(1+x)))))))
enddo
end program power
Writing power.f90
Let’s compile this code with aggressive optimization turned on.
!gfortran -O3 power.f90 -o power
!time ./power
1000000
0.01user 0.00system 0:00.01elapsed 100%CPU (0avgtext+0avgdata 2560maxresident)k
56inputs+0outputs (1major+112minor)pagefaults 0swaps
The performance is comparable to that of the C code. This means that choosing between C and Fortran depends more on your personal history and taste, as well as on minor syntax preferences. There are many fundamental differences between C and Fortran: arrays stored contiguously in memory using column-major (Fortran) versus row-major (C) ordering, loop index starting with 0 (C) or 1 (Fortran), etc. But overall, modern Fortran and C++ have broadly similar language capabilities, including support for object-oriented programming.
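The storage-order difference can be made concrete with the linear memory offset of element (i, j) in a two-dimensional array with nrows rows and ncols columns: C stores rows contiguously (row-major) while Fortran stores columns contiguously (column-major). The sketch below uses zero-based indices in both cases for simplicity (Fortran indices normally start at 1), and the function names are illustrative only.

```python
def offset_row_major(i, j, nrows, ncols):
    """C convention: elements of the same row are adjacent in memory."""
    return i * ncols + j


def offset_col_major(i, j, nrows, ncols):
    """Fortran convention: elements of the same column are adjacent in memory."""
    return i + j * nrows


nrows, ncols = 2, 3
c_order = [[offset_row_major(i, j, nrows, ncols) for j in range(ncols)]
           for i in range(nrows)]
f_order = [[offset_col_major(i, j, nrows, ncols) for j in range(ncols)]
           for i in range(nrows)]
print(c_order)  # [[0, 1, 2], [3, 4, 5]]
print(f_order)  # [[0, 2, 4], [1, 3, 5]]
```

The practical consequence is that the innermost loop should preferably run over the last index in C and over the first index in Fortran, so that memory is traversed contiguously.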
Compiling a C++ code#
Finally, let’s quickly explore the C++ compiler. We will first look at the Hello world code and then at the power calculations. We will use here again the GCC C++ compiler. Note that the file name must end with a suffix such as .cpp to be recognized as C++ code.
%%writefile hello.cpp
#include <iostream>
int main() {
    // This is a comment
    std::cout << "Hello World!";
    return 0;
}
Writing hello.cpp
!gcc hello.cpp -o hello -lstdc++
!./hello
Hello World!
The syntax is again quite different. The I/O library has a different name.
The code for the computational example now looks like:
%%writefile mult.cpp
#include <cstdlib>
#include <iostream>
int main()
{
    int i,n=1000000;
    // int i,n=100000000;
    float x,y;
    std::cout << n;
    for (i=0;i<n;i++){
        x=rand();
        y=x*(1+x*(1+x*(1+x*(1+x*(1+x*(1+x*(1+x)))))));
    }
    return 0;
}
Writing mult.cpp
In order for this code to link properly when using the gcc command, we have to link it explicitly with the standard C++ library. (The g++ command does this automatically.)
!gcc -O3 mult.cpp -o mult -lstdc++
!time ./mult
10000000.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 3328maxresident)k
0inputs+0outputs (0major+136minor)pagefaults 0swaps
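We get the same performance as with C and Fortran. Note that the output starts with 1000000 immediately followed by the timing report, because the program prints n without a trailing newline.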
Building a code with more than one file#
It is not recommended to develop complex software using only one giant file with millions of lines. Good practice in software engineering favors using multiple files, one per class of objects or functions.
Compiling software with multiple files is slightly more complex than what we did so far.
Let’s decompose our Hello world code into a main routine and a subroutine, each of which is coded in a separate file.
%%writefile hello.f90
program main
integer :: i=1
call greetings(i)
end program main
Overwriting hello.f90
%%writefile greet.f90
subroutine greetings(i)
integer, intent(in) :: i
write(*,*)"Hello world!",i
end subroutine greetings
Writing greet.f90
We will first compile each individual file using the -c option of the compiler. This tells the compiler to turn the file and all the functions it contains into an object file, without linking.
!gfortran -c hello.f90
!gfortran -c greet.f90
We see now using the ls command in the cell below that we have 2 new files greet.o and hello.o. These object files contain independent functions that are not linked together just yet.
!ls
Compiled_Languages.ipynb hello power
a.out hello.c power.c
cmake_example hello.cpp power.f90
complexLook.jpg hello.f90 profiling.md
ddt_example.png hello.o profiling_stone_teyssier_slides.pdf
debugging.md mult rust_example
greet.f90 mult.c
greet.o mult.cpp
We now need to link these functions together to get the final executable. The compiler will perform this linking operation using all the required .o files as arguments. The final executable will be the result of this linking operation and will be given the name hello using the -o option.
!gfortran hello.o greet.o -o hello
!./hello
Hello world! 1
Preprocessor directives#
A very popular and convenient way of programming is to use preprocessor directives. Preprocessor directives have their own syntax and can be seen as yet another programming language. As the name indicates, before the compiler actually compiles your code, if the -cpp option has been used, it will first call the preprocessor. The preprocessor will go through your code and look for directives, which are lines starting with # in the first column. Be careful with this strict rule.
The goal of the preprocessor is to really edit your file before sending it to the compiler. Only the parts of the code that the preprocessor has kept will be compiled. This is useful to get efficient code because the tests are performed only once at compilation time rather than at run time.
The example below is quite self-explanatory. More details on preprocessor directives can be found here.
%%writefile greet.f90
subroutine greetings(i)
integer, intent(in) :: i
#ifdef FRENCH
write(*,*)"Bonjour tout le monde !",i
#else
write(*,*)"Hello world!",i
#endif
end subroutine greetings
Overwriting greet.f90
!gfortran -cpp -DFRENCH -c greet.f90
!gfortran hello.o greet.o -o hello
!./hello
Bonjour tout le monde ! 1
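To illustrate what “editing the file before compilation” means, here is a deliberately minimal python sketch of conditional inclusion. It is not how the real preprocessor is implemented: it only understands non-nested #ifdef, #else and #endif directives, and the function name `preprocess` is made up for this example.

```python
def preprocess(source, defined):
    """Keep or drop lines according to #ifdef / #else / #endif (no nesting)."""
    kept = []
    keeping = True  # outside any conditional block, every line is kept
    for line in source.splitlines():
        if line.startswith("#ifdef"):
            macro = line.split()[1]
            keeping = macro in defined
        elif line.startswith("#else"):
            keeping = not keeping
        elif line.startswith("#endif"):
            keeping = True
        elif keeping:
            kept.append(line)
    return "\n".join(kept)


source = """subroutine greetings(i)
integer, intent(in) :: i
#ifdef FRENCH
write(*,*)"Bonjour tout le monde !",i
#else
write(*,*)"Hello world!",i
#endif
end subroutine greetings"""

print(preprocess(source, defined={"FRENCH"}))  # as with -DFRENCH
print(preprocess(source, defined=set()))       # without -DFRENCH
```

In both cases the compiler only ever sees one of the two write statements, so no test remains in the compiled code.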
Libraries#
We have already used several libraries provided by the system, such as the standard C and C++ libraries or the C math library. We can also create our own libraries with all our functions. For this, we have to use specific command line instructions. The sequence of commands below will create a library archive or .a file.
!ar r libgreet.a greet.o
ar: creating libgreet.a
!ranlib libgreet.a
!ls
Compiled_Languages.ipynb hello mult.cpp
a.out hello.c power
cmake_example hello.cpp power.c
complexLook.jpg hello.f90 power.f90
ddt_example.png hello.o profiling.md
debugging.md libgreet.a profiling_stone_teyssier_slides.pdf
greet.f90 mult rust_example
greet.o mult.c
This library contains a bunch of compiled objects that we can link with our codes later. To do so, just use the -L and -l options.
The first option with a capital L tells the compiler in what directory it will find the library and the second option with a small l tells the compiler the name of the library, so that -lname tells the compiler to look for the file libname.a.
Let’s check that it works.
!gfortran hello.f90 -o hello -L. -lgreet
!./hello
Bonjour tout le monde ! 1
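The naming rule behind -L and -l can be sketched in a few lines of python. This is a simplified model, not the linker's actual implementation: the function `resolve_library` is hypothetical, and it assumes the usual GNU linker behaviour of trying libname.so before libname.a in each directory, in the order the directories are given.

```python
import os
import tempfile


def resolve_library(name, search_dirs):
    """Return the first lib<name>.so or lib<name>.a found in search_dirs."""
    for directory in search_dirs:
        for suffix in (".so", ".a"):  # shared library tried first (assumed)
            candidate = os.path.join(directory, f"lib{name}{suffix}")
            if os.path.isfile(candidate):
                return candidate
    return None


with tempfile.TemporaryDirectory() as libdir:
    # Empty placeholder file standing in for the archive built with `ar`.
    open(os.path.join(libdir, "libgreet.a"), "w").close()

    found = resolve_library("greet", [libdir])     # like -L<libdir> -lgreet
    missing = resolve_library("greet2", [libdir])  # no libgreet2.* in there

print(os.path.basename(found))  # libgreet.a
print(missing)                  # None
```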
Using PATH, LIBRARY_PATH and LD_LIBRARY_PATH#
We have now seen most of the basics of compiled languages. We have seen in particular how to add to the PATH environment variable one or more directories that contain our executables.
There is a similar environment variable that tells the compiler where to look for our libraries. The corresponding environment variable is called LIBRARY_PATH.
In this jupyter notebook, we can see the values these environment variables had when the notebook was started.
!echo $PATH
/home/runner/work/se-for-sci/se-for-sci/.pixi/envs/default/bin:/home/runner/.pixi/bin:/snap/bin:/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
!echo $LIBRARY_PATH
We see for example that the LIBRARY_PATH has not been set. In a Terminal window, we can set this variable to our local directory so that we could now access the library libgreet.a from everywhere, without the need to specify -L. or -L~/se-for-sci/content/week09.
For this, just type in the Terminal:
export LIBRARY_PATH=~/se-for-sci/content/week09
echo $LIBRARY_PATH
You can now try to compile the code in the Terminal window using:
gfortran hello.f90 -o hello -lgreet
We don’t need the -L option anymore because the compiler knows where to find libgreet.a.
In most cases, you will use this strategy to link objects from a library into your executable. Note that the resulting executable can be quite big, because it contains all the object .o files that the linker has extracted from the library. In this case, we say the executable uses a static library.
Another strategy consists in loading the library objects dynamically at run time. The system will load the content of the so-called shared libraries. A shared library can be built using the -fPIC and -shared options as:
gfortran -fPIC -cpp -DFRENCH -c greet.f90
gfortran -shared -o libgreet2.so greet.o
Finally, the executable can be generated using:
gfortran hello.f90 -o hello -L. -lgreet2
Be careful, at run time, the shared libraries must be accessible by the system. You can add the directory containing your shared libraries to the environment variable LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=~/se-for-sci/content/week09
echo $LD_LIBRARY_PATH
You can then execute ./hello with the properly loaded libraries at run time.
If the LD_LIBRARY_PATH variable is not set properly, you will get the error:
./hello: error while loading shared libraries: libgreet2.so: cannot open shared object file: No such file or directory
Dealing with complex libraries and compiler versions using module#
In more complex cases, such as large scientific libraries, setting by hand the different PATH variables can become tedious and error prone. There is a nice tool on most modern Linux servers called module. It allows you to load (or unload) dynamically the libraries installed on your system by setting up for you all these environment variables.
Since module will change the environment variables, we cannot use the jupyter notebook. We have to work directly in the Terminal window.
In the Terminal window, type:
$ module list
You should see:
Currently Loaded Modulefiles:
1) anaconda3/2021.11
Type now:
$ module avail
You should see a long list of available modules. If you type
$ module avail fftw
you will get all the modules available for the fftw fast Fourier transform library.
!module avail fftw
/usr/bin/sh: 1: module: not found
We will now load version 3.3.9 of the fftw library built with the gcc compiler.
For this, type in the Terminal window:
$ module load fftw/gcc/3.3.9
$ module list
You should see:
Currently Loaded Modulefiles:
1) anaconda3/2021.11 2) fftw/gcc/3.3.9
It is worth now inspecting the different environment variables:
$ echo $LIBRARY_PATH
/usr/local/fftw/gcc/3.3.9/lib64
$ echo $LD_LIBRARY_PATH
/usr/local/fftw/gcc/3.3.9/lib64
We see that the Linux environment has been properly set to use the fftw
library.
Inspecting what is in this directory, we get:
$ ls /usr/local/fftw/gcc/3.3.9/lib6
cmake libfftw3f_threads.so.3.6.9 libfftw3_omp.a libfftw3q_threads.so
libfftw3.a libfftw3l.a libfftw3_omp.so libfftw3q_threads.so.3
libfftw3f.a libfftw3l_omp.a libfftw3_omp.so.3 libfftw3q_threads.so.3.6.9
libfftw3f_omp.a libfftw3l_omp.so libfftw3_omp.so.3.6.9 libfftw3.so
libfftw3f_omp.so libfftw3l_omp.so.3 libfftw3q.a libfftw3.so.3
libfftw3f_omp.so.3 libfftw3l_omp.so.3.6.9 libfftw3q_omp.a libfftw3.so.3.6.9
libfftw3f_omp.so.3.6.9 libfftw3l.so libfftw3q_omp.so libfftw3_threads.a
libfftw3f.so libfftw3l.so.3 libfftw3q_omp.so.3 libfftw3_threads.so
libfftw3f.so.3 libfftw3l.so.3.6.9 libfftw3q_omp.so.3.6.9 libfftw3_threads.so.3
libfftw3f.so.3.6.9 libfftw3l_threads.a libfftw3q.so libfftw3_threads.so.3.6.9
libfftw3f_threads.a libfftw3l_threads.so libfftw3q.so.3 pkgconfig
libfftw3f_threads.so libfftw3l_threads.so.3 libfftw3q.so.3.6.9
libfftw3f_threads.so.3 libfftw3l_threads.so.3.6.9 libfftw3q_threads.a
These are quite a few useful libraries we could play with. Note the .a and .so file name suffixes, for static and shared libraries respectively.
To remove this library, just type:
$ module unload fftw/gcc/3.3.9
You can check yourself in the Terminal window that the environment variables are now back to their original values.
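Conceptually, loading and unloading a module amounts to adding and removing directories in colon-separated environment variables. The sketch below illustrates that idea on a plain python dictionary standing in for the shell environment. It is a simplified model, not the actual implementation of the module tool, and the helper names `load` and `unload` are invented for this example.

```python
def load(env, variable, directory):
    """Prepend directory to a colon-separated variable stored in env."""
    parts = [p for p in env.get(variable, "").split(":") if p]
    if directory not in parts:
        parts.insert(0, directory)
    env[variable] = ":".join(parts)


def unload(env, variable, directory):
    """Remove directory from the variable; drop the variable if it becomes empty."""
    parts = [p for p in env.get(variable, "").split(":") if p and p != directory]
    if parts:
        env[variable] = ":".join(parts)
    else:
        env.pop(variable, None)


env = {}  # stand-in for the shell environment
fftw_lib = "/usr/local/fftw/gcc/3.3.9/lib64"

for variable in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
    load(env, variable, fftw_lib)
print(env["LIBRARY_PATH"])  # /usr/local/fftw/gcc/3.3.9/lib64

for variable in ("LIBRARY_PATH", "LD_LIBRARY_PATH"):
    unload(env, variable, fftw_lib)
print(env)  # {}
```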
Compiling more complex codes: Makefile and CMake#
Compiling codes can be quite complex when the code base contains thousands of files with many dependencies. A dependency is when a piece of code depends on another library or object to function properly.
When you work on your code, compiling these files can be very time consuming, especially if you ask the compiler to optimize your code. The trick is to only compile the files that have changed since you last compiled them. Of course, you have to recompile the corresponding object, but also all the objects that depend on it. This is the reason why dependencies are so important.
We now have modern tools that compile complex codes while dealing properly with dependencies.
Historically, the first tool to deliver such a service was make. We will describe it briefly in this course, as you will unfortunately still have to deal with Makefiles. The message here is that hand-written Makefiles should be replaced as much as possible with cmake, a more modern tool.
Compiling code using make and the Makefile#
A Makefile is a file containing specific instructions to compile your code. In a sense, this is yet another programming language but designed only for code compilation. What is nice with make is the possibility to only recompile the files that have changed since the last time they were compiled.
The syntax is quite simple with one major annoying caveat. The general format is as follows:
target : prerequisite
instructions
A target is usually an object or an executable.
A prerequisite is an object or a file that the target depends on.
The lines containing instructions MUST start with a TAB. Spaces are not allowed. This is the most annoying aspect of Makefile. There is however a workaround as shown below using the semi-colon ;.
%%writefile Makefile
hello : hello.o greet.o Makefile ; gfortran hello.o greet.o -o hello
%.o : %.f90 ; gfortran -cpp -DFRENCH -c $<
clean : ; rm *.o
Writing Makefile
Make sure that the 2 files hello.f90 and greet.f90 are here.
!ls *.f90
greet.f90 hello.f90 power.f90
To execute this Makefile, just type:
$ make
or
$ make hello
in the Terminal window.
In the jupyter notebook, we can also compile the code using:
!make
gfortran hello.o greet.o -o hello
!./hello
Bonjour tout le monde ! 1
Now let’s pretend to modify only the file greet.f90, using touch to update its time stamp.
!touch greet.f90
!make
gfortran -cpp -DFRENCH -c greet.f90
gfortran hello.o greet.o -o hello
We see that only this file is recompiled. The new executable is also generated by linking the new object.
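The rule applied by make here can be sketched as a time stamp comparison: a target is rebuilt when it does not exist or when any of its prerequisites is newer than it. The python function below is a simplified model under that assumption (the name `needs_rebuild` is made up, and the real make handles many more cases); os.utime plays the role of touch.

```python
import os
import tempfile


def needs_rebuild(target, prerequisites):
    """True if target is missing or older than at least one prerequisite."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_time for p in prerequisites)


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "greet.f90")
    obj = os.path.join(workdir, "greet.o")

    open(source, "w").close()
    before_build = needs_rebuild(obj, [source])  # object does not exist yet

    open(obj, "w").close()
    os.utime(source, (1_000, 1_000))             # source older than object
    os.utime(obj, (2_000, 2_000))
    up_to_date = needs_rebuild(obj, [source])

    os.utime(source, (3_000, 3_000))             # like `touch greet.f90`
    after_touch = needs_rebuild(obj, [source])

print(before_build, up_to_date, after_touch)  # True False True
```

Because hello lists greet.o among its prerequisites, rebuilding greet.o in turn makes the executable out of date, which is why the link step runs again.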
We can remove all the objects using:
!make clean
rm *.o
and recompile everything using:
!make
gfortran -cpp -DFRENCH -c hello.f90
gfortran -cpp -DFRENCH -c greet.f90
gfortran hello.o greet.o -o hello
We can see a more complex example of Makefile by cloning the ramses repository. You can clone the repository using the following command:
!git clone https://rteyssie@bitbucket.org/rteyssie/ramses.git --quiet
If your Terminal window does not have access to the Internet, connect directly to the corresponding BitBucket web page here using your browser.
Using the Terminal window, try and compile this code using the Makefile in the ramses/bin directory.
Compiling code using cmake#
Makefiles are a system-specific build system: they just run commands for you when things need “making”. cmake was designed to be cross-platform, and is a “build system generator”: it makes your Makefiles for you (or Ninja files, or files for other build systems). It is based on a very high level syntax and can explore your system, looking automatically for libraries and configuring them properly for you. Although the added value of cmake really shows for complex projects (or on a different system, like Windows, or if you want to use an IDE, or if you want to pack up and distribute your code), we will start with a simple example.
Let’s assume you have your 2 original Fortran files hello.f90 and greet.f90.
Let’s create a new file with name CMakeLists.txt. It must have this name.
%%writefile CMakeLists.txt
cmake_minimum_required(VERSION 3.18...3.24)
project(hello LANGUAGES Fortran)
add_executable(hello hello.f90 greet.f90)
target_compile_definitions(hello PRIVATE FRENCH)
set_source_files_properties(greet.f90 PROPERTIES Fortran_PREPROCESS TRUE)
Writing CMakeLists.txt
In a Terminal window, type the following to create a directory named build and configure your project:
$ cmake -S . -B build
This will generate a Makefile (or Ninja files, an Xcode project, an MSVC solution, or whatever build system you want). To build, type:
$ cmake --build build
The Makefile will run and your executable is ready! To run it, just type:
$ ./build/hello
We will now walk you through a more complex cmake example available in the directory cmake_example.
Additional material can be found in this book:
And this useful workshop: