Debugging and Profiling #
Lecture source: https://missing.csail.mit.edu/2020/debugging-profiling/
Syscalls #
Use strace to analyze what syscalls a program may used.
$ sudo strace -e lstat ls -l / > /dev/null
lstat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("swapfile", {st_mode=S_IFREG|0600, st_size=1989529600, ...}) = 0
lstat("bin", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("lib64", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("vmlinuz", {st_mode=S_IFLNK|0777, st_size=31, ...}) = 0
lstat("initrd.img", {st_mode=S_IFLNK|0777, st_size=34, ...}) = 0
lstat("home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("proc", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
lstat("var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
lstat("dev", {st_mode=S_IFDIR|0755, st_size=3800, ...}) = 0
lstat("root", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
lstat("initrd.img.old", {st_mode=S_IFLNK|0777, st_size=33, ...}) = 0
lstat("etc", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("run", {st_mode=S_IFDIR|0755, st_size=1020, ...}) = 0
lstat("usr", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("media", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("lost+found", {st_mode=S_IFDIR|0700, st_size=16384, ...}) = 0
lstat("opt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat("sbin", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
lstat("snap", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("vmlinuz.old", {st_mode=S_IFLNK|0777, st_size=30, ...}) = 0
lstat("boot", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("srv", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
+++ exited with 0 +++
Profilers #
Use line_profiler with kernprof to get the running time on every lines in the Python code.
You can also get this from Python’s
cProfile.
Use memory-profiler to investigate the memory usage for each line of the Python code.
Use perf to do the event profiling, it has a lot of traceable events, dig it and use it!
Use FlameGraph to visualize the profiling data (from perf etc.)
If you are with Python, you can use the pycallgraph to visualize the Python funciton call graph with some of profiling information.
If you are only interest in code function call graph, give
code2flowa try.
Resource Monitoring #
IMO, this is the critical part for all the profiling, because profiling is all about the resource, and the monitoring is the begining of the profiling.
General monitoring:
I/O:
Disk usage:
Memory:
Open files (file descriptors):
lsof: list of open files.
Network connections statistics:
ss: usess -sto show sockets usage summary.lsof: sockets are files in Linux, uselsof -i :<port_num>to find the sockets.ip: a tool to show / manipulate routing, network devices, interfaces and tunnels.
Network usage:
+benchmark for cli tools: hyperfine
Specialized Tools #
Use hyperfine to benchmark the CLI (Command-Line Interface) tools.
Exercises #
Debugging #
-
Use
journalctlon Linux orlog showon macOS to get the super user accesses and commands in the last day. If there aren’t any you can execute some harmless commands such assudo lsand check again.
$ journalctl -S "`date -d 'last day' '+%Y-%m-%d'`" | grep "root" -
Do this hands on
pdbtutorial to familiarize yourself with the commands. For a more in depth tutorial read this.
TODO
-
Install
shellcheckand try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically.#!/bin/sh ## Example: a typical script with several problems for f in $(ls *.m3u) do grep -qi hq.*mp3 $f \ && echo -e 'Playlist $f contains a HQ file in mp3 format' done -
(Advanced) Read about reversible debugging and get a simple example working using
rrorRevPDB.
Profiling #
-
Here are some sorting algorithm implementations. Use
cProfileandline_profilerto compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use thenmemory_profilerto check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Useperfto look at the cycle counts and cache hits and misses of each algorithm. -
Here’s some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number.
#!/usr/bin/env python def fib0(): return 0 def fib1(): return 1 s = """def fib{}(): return fib{}() + fib{}()""" if __name__ == '__main__': for n in range(2, 10): exec(s.format(n, n-1, n-2)) # from functools import lru_cache # for n in range(10): # exec("fib{} = lru_cache(1)(fib{})".format(n, n)) print(eval("fib9()"))Put the code into a file and make it executable. Install prerequisites:
pycallgraphandgraphviz. (If you can rundot, you already have GraphViz.) Run the code as is withpycallgraph graphviz -- ./fib.pyand check thepycallgraph.pngfile. How many times isfib0called?. We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling eachfibNfunction now? -
A common issue is that a port you want to listen on is already taken by another process. Let’s learn how to discover that process pid. First execute
python -m http.server 4444to start a minimal web server listening on port4444. On a separate terminal runlsof | grep LISTENto print all listening processes and ports. Find that process pid and terminate it by runningkill <PID>. -
Limiting processes resources can be another handy tool in your toolbox. Try running
stress -c 3and visualize the CPU consumption withhtop. Now, executetaskset --cpu-list 0,2 stress -c 3and visualize it. Isstresstaking three CPUs? Why not? Readman taskset. Challenge: achieve the same usingcgroups. Try limiting the memory consumption ofstress -m. -
(Advanced) The command
curl ipinfo.ioperforms a HTTP request and fetches information about your public IP. Open Wireshark and try to sniff the request and reply packets thatcurlsent and received. (Hint: Use thehttpfilter to just watch HTTP packets).