.. _sec-compiling-and-running-basics:

==============================
Compiling and running basics
==============================

Motivation and plan
===================

This is a review to make sure we are comfortable with the business of
building and running software from source code. As usual I focus on
Python and C.

Note that although I discuss aspects of the languages (Python and C),
the focus here is on how to *build* programs, and on the interactions
with the file system. That's what this hacker's compendium is mostly
about: not "pure" programming language information, but rather tricks
to be really productive when programming on your operating system.

Python
======

You don't need to worry about compiling Python code, but you should
structure your Python programs to execute cleanly. You can do this by
giving your program the following structure:

.. code-block:: python
   :caption: simple-program.py -- not much here, just demonstrates a way to set up a ``main()`` function.

   #! /usr/bin/env python3

   def main():
       print('this is my main program')
       ## [all the rest of your main program]

   ## [other functions]

   if __name__ == '__main__':
       main()

You should then make the program executable and run it with:

.. code-block:: console

   $ chmod +x simple-program.py
   $ ./simple-program.py

In Python you don't have the kind of separate compilation and linking
of binary files that you have with languages like C, C++ and FORTRAN,
but you can and should distribute your code between separate files
when it starts growing. This offers a nice conceptual separation, and
avoids too-long source files.

An example could be in the following three files:

.. code-block:: python
   :caption: prog-with-functions.py -- a program which calls two functions which are stored in separate files.

   #! /usr/bin/env python3

   import file_with_f1
   import file_with_f2

   def main():
       print('this is my main program; I will call f1')
       print('result:', file_with_f1.f1())
       print('and now I will call f2')
       print('result:', file_with_f2.f2())

   ## [other functions]

   if __name__ == '__main__':
       main()

.. code-block:: python
   :caption: file_with_f1.py -- a module with the function ``f1()`` in it.

   def f1():
       return 42

.. code-block:: python
   :caption: file_with_f2.py

   import math

   def f2():
       """f2() returns the sine of PI/6"""
       return math.sin(math.pi/6.0) ## sin and pi are defined in math

Some things to note here:

* What's with that snippet:

  .. code-block:: python

     [...]
     if __name__ == '__main__':
         main()

  at the end of the file with the main program? You don't need to
  remember this, but run through it once: if the file is executed
  directly then the global variable ``__name__`` is set to the string
  ``'__main__'``, which means we want to call ``main()``. If the file
  is imported as a module instead, ``__name__`` holds the module name
  and ``main()`` is not called. That's why we put that litany at the
  end of Python programs.

* The filename ``file_with_f1.py`` has underscores instead of hyphens.
  This is because the hyphen character is the same as a minus sign, so
  Python syntax would *not* let you have something like
  ``import file-with-f1``. The main program can have hyphens, but if
  there's any chance that you might some day turn it into a module and
  call it from another program, then it should use underscores too.

* Once you import the module you call the functions with the syntax
  ``MODULE.FUNC()``, for example ``file_with_f1.f1()``. The ``import``
  instruction has other features so you can abbreviate the name of the
  module, or even skip it altogether, or selectively import only some
  functions from the file.

* The files ``file_with_f1.py`` and ``file_with_f2.py`` need to be in
  the same directory as the program that calls them. You can put them
  in a different directory if you add that directory to the Python
  list ``sys.path``.
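
The ``import`` variants and the ``sys.path`` remark above can be
sketched in one short script. This is a minimal sketch rather than a
recommended layout: so that it is self-contained, it writes a
throwaway copy of ``file_with_f1.py`` into a temporary directory (the
name ``module_dir`` and the use of a temporary directory are
assumptions made only for this illustration), adds that directory to
``sys.path``, and then imports the module in three different ways.

.. code-block:: python

   import os
   import sys
   import tempfile

   # Assumption for this sketch: create a scratch directory holding a
   # copy of the file_with_f1.py module shown earlier.
   module_dir = tempfile.mkdtemp()
   with open(os.path.join(module_dir, 'file_with_f1.py'), 'w') as f:
       f.write('def f1():\n    return 42\n')

   # The module is not in the same directory as this script, so we
   # add its directory to the list of places Python searches.
   sys.path.append(module_dir)

   import file_with_f1            # plain import: call file_with_f1.f1()
   import file_with_f1 as ff1     # abbreviate the module name
   from file_with_f1 import f1    # import only f1; no module prefix needed

   print(file_with_f1.f1(), ff1.f1(), f1())   # prints: 42 42 42

All three calls reach the same function; which form you use is mostly
a question of readability.
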
C
=

Flow of compiling and linking
-----------------------------

In C (and C++ and FORTRAN and other languages that are typically
*compiled*) your program will often consist of one or more ``.c``
files. When you first learn to program in C you might have a single
file ``my-prog.c``:

.. code-block:: c
   :caption: my-prog.c

   #include <stdio.h>

   int main()
   {
     printf("hello world\n");
     return 0;
   }

and you might *compile* it and run it like this:

.. code-block:: console

   $ gcc my-prog.c -o my-prog
   $ ./my-prog
   hello world
   $

But as your program grows bigger you will find that "separate
compilation" is a big deal in C: you will often have many files that
you compile separately, after which you *link* them together into a
single *executable*.

The process for a single source file might look like this:

.. graphviz::

   digraph {
      { rank=source "create C main\nprogram my-prog.c" }
      { rank=sink "run it with ./my-prog" }
      edge [lblstyle="above, sloped"];
      "create C main\nprogram my-prog.c" -> "compile it with\ngcc my-prog.c -o my-prog" [label="single C file"];
      "compile it with\ngcc my-prog.c -o my-prog" -> "run it with ./my-prog" -> "make changes to my-prog.c" -> "compile it with\ngcc my-prog.c -o my-prog";
   }

And if you have multiple source files ``my-prog.c``, ``f1.c`` and
``f2.c``, they might look like this:

.. code-block:: c
   :caption: my-prog.c

   #include <stdio.h>

   extern int f1();
   extern double f2();

   int main()
   {
     printf("function f1() returns %d\n", f1());
     /* f2() returns a double, so it is printed with %g */
     printf("function f2() returns %g\n", f2());
     return 0;
   }

.. code-block:: c
   :caption: f1.c

   /* f1() returns the number 42 */
   int f1()
   {
     return 42;
   }

.. code-block:: c
   :caption: f2.c

   #include <math.h>

   /* f2() returns the sine of PI/6 */
   double f2()
   {
     return sin(M_PI/6.0); /* M_PI is defined in math.h */
   }

And the process for compiling and linking those multiple source files
might look like this:

.. graphviz::

   digraph {
      { rank=same "compile *each* file with\ngcc -c file.c" "recompile *just*\nthat .c file with\ngcc -c file.c" }
      { rank=source "create C main\nprogram my-prog.c\nand files f1.c, f2.c" }
      { rank=sink "run it with ./my-prog" }
      edge [lblstyle="above, sloped"];
      "create C main\nprogram my-prog.c\nand files f1.c, f2.c" -> "compile *each* file with\ngcc -c file.c" [label="multiple C files"];
      "compile *each* file with\ngcc -c file.c" -> "link files together with\ngcc -o my-prog my-prog.o f1.o f2.o -lm" -> "run it with ./my-prog" -> "make changes to a\nsingle .c file" -> "recompile *just*\nthat .c file with\ngcc -c file.c" -> "link files together with\ngcc -o my-prog my-prog.o f1.o f2.o -lm";
   }

What are libraries and header files?
------------------------------------

Terminology is important to understand what's happening when you
compile and link C (and C++) code.

Let's start by talking about *libraries*. When you start programming
in C you are told to put this line at the top of your program:

.. code-block:: c

   #include <stdio.h>

and you might have thought in a muddled manner (as I did at first)
"aha! that line links to the standard I/O library which gives me
functions like printf()!!"

This will get you through your initial learning process, but let's try
to make that narrative more precise:

Using a library in C involves two parts: one is telling your code what
functions are available and what arguments they take, so that you can
call them properly. The other is to link the library code with your
code.

The first part (compile-time) is achieved by *including a header file*
in your source code, with instructions like ``#include <math.h>``.

The second part (link-time) is achieved when you *link* your compiled
code with the libraries you use. In the command line

.. code-block:: console

   $ gcc myprog.c -lm

the ``-lm`` portion means "link with the library ``libm``", which is
the C math library. The linker finds it in a file such as
``/usr/lib/libm.a``; the exact path and file name vary between
systems.
So what happens if you forget the ``#include <math.h>`` or the
``-lm``? Let's try it out. Write a simple program:

.. code-block:: c

   #include <stdio.h>
   #include <math.h>

   int main()
   {
     double x = sin(M_PI/3);
     printf("%g\n", x);
     return 0;
   }

Comment out the ``#include <math.h>`` and try compiling with
``gcc myprog.c -lm``. You will get a warning to the effect that you
have an "implicit declaration of sin" and an error that ``M_PI`` is
undeclared. That's because these things are declared in ``math.h``.

You can uncomment the ``#include <math.h>`` and compile again, but
this time leave out the ``-lm`` and compile with ``gcc myprog.c``. On
many systems the link step then fails with an "undefined reference to
sin" error, because the code for ``sin()`` lives in the math library.
Some compilers evaluate a constant expression like ``sin(M_PI/3)`` at
compile time, in which case you may not see the error.

C: scope and resolving external variables
=========================================

.. [not yet written]

This is another one of those topics that shifts you from a "beginner
learning the syntax of a language" to being a "person who can
understand and design larger programs".

Programs grow and become complex, and much of the business of software
engineering is coming up with ways to tame the complexity of large
programs.

The first of these steps dates back a very long time: compilation in
separate files. When you compile