.. _sec-probing-memory:

==========================================
Probing memory on a running Linux system
==========================================

Motivation and plan
===================

This lesson tries to "cleverly" throw in side-by-side examples of
Python and C programming, to also give some exposure to both
languages. Python programming brings great advantages in life, and C
programming even deeper ones.

I will give a very small conceptual explanation of memory, and then
see if the lesson here can turn that into a clear understanding.

Much of this might apply to other UNIX systems; it's an interesting
exercise to try it on a `*BSD` system.

Before starting we will want to have these programs installed:

.. code-block:: console

   sudo apt install gcc procps gnuplot

Concepts
========

.. glossary::

   physical RAM
      Fast, plentiful but not as plentiful as disk space, volatile
      (does not survive power-off).

   disk space
      Slow, very plentiful (and, since you can just go buy more or use
      network-mounted space, it's virtually infinite).

   dynamic memory allocation
      A program starts out with some fixed amount of memory allocated
      to it for what the "run time support" already knows it will
      need. The program can then ask for more memory while it runs.
      It can also release that memory. The mechanisms are different
      for different languages, but at the low level it usually
      involves calling the C library function ``malloc()``, which in
      turn requests memory from the operating system when it needs
      more.

   virtual memory
      If a program asks for more memory than the computer has free in
      RAM then what happens? This is where virtual memory kicks in.
      The current program will be given the memory. To allow this, a
      chunk (called a "page") of some other program's memory (or this
      program's memory!) will be saved off to the swap area on the
      hard disk. This is called "swapping out a page". Once that page
      is saved to disk, its RAM is free to be used by the process that
      needs it.

      So what happens when a program needs to use that "swapped out"
      memory again? It gets "swapped in", returning to memory.

      The virtual memory management keeps track of every "page" of
      memory that has been "swapped out" and is ready to "swap it back
      in". It does so in such a way that the application program never
      needs to see it happen: it just requests and releases memory.

:index:`thrashing`: if virtual memory is used too much, the computer
can end up paging crazily, constantly swapping pages of RAM out to
disk and bringing others back in from disk. This can make a computer
grind to a halt, because what should be very fast RAM access has to
wait for a bunch of disk activity. It is important to understand this
kind of performance problem, which appears when a computer starts
paging too much.

:index:`garbage collection`: in high level languages (formerly called
"very high level languages", VHLLs), when you create big lists and
other objects, you are given the memory automatically. When you don't
use an object anymore, the language's *run time infrastructure* will
free up that memory. This is called *garbage collection*.

Simplest programs to look at memory concepts
============================================

This first program just calls ``malloc()`` and ``free()``. On its own
it does not give us much insight into what is actually being done
with that memory.

.. _listing-simple-malloc:

.. literalinclude:: simple-malloc.c
   :language: c
   :caption: simple-malloc.c - a very simple example of using ``malloc()`` and ``free()``

Using memory incorrectly
========================

.. _listing-mem-trash:

.. literalinclude:: mem-trash.c
   :language: c
   :caption: mem-trash.c - access invalid memory areas.

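To connect the concepts above to numbers you can watch, here is a
minimal Python sketch. It is not one of the accompanying programs. It
assumes a Linux system that provides ``/proc/self/status``, and the
helper names ``parse_status_kib``, ``current_kib`` and ``report`` are
made up for this illustration. It prints the process's total virtual
size (``VmSize``) and the part currently resident in RAM (``VmRSS``)
before allocating a big string, while holding it, and after releasing
it.

.. code-block:: python

   import re


   def parse_status_kib(status_text, field):
       """Return the KiB value of a field (e.g. "VmRSS") found in the
       text of /proc/<pid>/status, or None if the field is absent."""
       pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB"
       match = re.search(pattern, status_text, re.MULTILINE)
       return int(match.group(1)) if match else None


   def current_kib(field):
       """Read one field for this process; None if /proc is unavailable
       (assumption: a Linux-style /proc filesystem)."""
       try:
           with open("/proc/self/status") as f:
               return parse_status_kib(f.read(), field)
       except OSError:
           return None


   def report(label):
       """Print the virtual size and resident size of this process."""
       print(f"{label:<18} VmSize={current_kib('VmSize')} KiB  "
             f"VmRSS={current_kib('VmRSS')} KiB")


   if __name__ == "__main__":
       report("before allocation")
       big = "x" * (50 * 1024 * 1024)   # one 50 MiB string, filled with data
       report("while holding it")
       del big                          # drop the only reference to the string
       report("after releasing")

On a typical Linux system with CPython you would expect both numbers
to rise while the string exists and to fall again once it is released,
but the exact behavior depends on the interpreter and the C library,
so treat the output as something to observe rather than a guarantee.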
Lesson: preparing the programs
==============================

Programs we will use
--------------------

* top (from Debian package procps)
* vmstat (package procps)
* gnuplot (package gnuplot-x11 or gnuplot5-qt or gnuplot5-x11)
* C compiler (package gcc)
* Python interpreter (package python3 or python)

Writing programs to use and release memory
------------------------------------------

There are two accompanying programs, ``memory-use-py.py`` and
``memory-use-c.c``, which use a given amount of memory for a given
amount of time. In Python it's done by allocating a single huge
string; in C by calling ``malloc()`` and then ``memset()``.

.. _listing-memory-use-py:

.. literalinclude:: memory-use-py.py
   :language: python
   :caption: memory-use-py.py - allocate huge amounts of memory on your computer so that you can see how the system responds. Python version.

.. _listing-memory-use-c:

.. literalinclude:: memory-use-c.c
   :language: c
   :caption: memory-use-c.c - allocate huge amounts of memory on your computer so that you can see how the system responds. C version.

Prepare to run them with:

.. code-block:: console

   chmod 755 memory-use-py.py
   gcc memory-use-c.c -o memory-use-c

Then try running the programs for brief runs:

.. code-block:: console

   ./memory-use-py.py 100 10
   ./memory-use-c 100 10

Lesson: real time monitoring of memory with top
===============================================

* Start at least three terminals.

* In one terminal type ``top`` and then type "M" so that the
  processes are sorted by memory use.

* In top, focus on the summary area at the top where it says "KiB
  Mem:" and "KiB Swap:" (some versions of top show "MiB Mem" and "MiB
  Swap" instead), as well as the top of the individual process
  section, where the processes using the most RAM are listed. Look at
  the "VIRT" and "RES" columns.

* We will need to run pretty big memory takers to rise above the
  bloat of the web browser and other programs! So in the second
  terminal run:

  .. code-block:: console

     ./memory-use-py.py 3000 40

* In the third terminal run:

  .. code-block:: console

     ./memory-use-c 3000 40

* Watch how they evolve in "top". Did they rise near the top of the
  memory use?

* And did you come close to using up all physical RAM? That's what
  the "KiB Mem:" and "KiB Swap:" lines can tell you.

* Now see if you can make your system thrash.

Long term monitoring of memory with vmstat
==========================================

#. run:

   .. code-block:: console

      vmstat -t 1 | tee vmstat.out

   The ``-t`` option puts a date and time stamp at the end of each
   line.

   .. note::

      If you are running this on a different UNIX-like system, like
      FreeBSD or macOS, the memory monitoring command might be a bit
      different. If ``vmstat`` does not run, you should try running
      ``vm_stat`` instead. The column which shows how much virtual
      memory is in use might be different, so you will need to change
      the plotting instruction below to plot a different column of
      numbers.

#. do a bunch of stuff with the memory-using programs: vary the
   amounts of memory and make the runs last a while

#. when you are done type "control-C" in the "vmstat" program

#. run:

   .. code-block:: console

      $ gnuplot
      gnuplot> plot 'vmstat.out' using 4 with lines
      gnuplot> ## and another plot:
      gnuplot> set multiplot layout 2,1
      gnuplot> plot 'vmstat.out' using 4 with lines
      gnuplot> plot 'vmstat.out' using 7 with lines

   On Linux, column 4 of the ``vmstat`` output is ``free`` (idle
   memory) and column 7 is ``si`` (memory swapped in from disk).

IMPROVEME: here's another snippet which I have not yet written up
properly. It replots the data every two seconds, with the time stamp
on the x axis:

::

   gnuplot> set xdata time
   gnuplot> set timefmt "%Y-%m-%d %H:%M:%S"
   gnuplot> do for [t=0:50000] {
   more> plot 'vmstat.out' using 18:4 with lines
   more> pause 2
   more> }

You can also couple it with

::

   pcstat -t 1 | tee pcstat.out

and similar plotting stuff, although the time format is different.

Advanced memory monitoring with valgrind and massif
===================================================

.. code-block:: console

   valgrind --tool=massif ./memory-use-c 100 10
   ls -lt massif.out.*    # from here glean the filename
   massif-visualizer massif.out.PID

Further resources
=================

Joyce Levine proposes these videos:

* https://www.youtube.com/watch?v=XV5sRaSVtXQ

* "also this video was helpful for me":
  https://www.youtube.com/watch?v=9wydl0VFmeQ&list=PLEDF53DC200BAF48D&index=2

* "might be helpful for pointers, might just be confusing though":
  https://www.youtube.com/watch?v=a25FQoBhng8&list=PLEDF53DC200BAF48D&index=3