Debugging and Profiling #
Lecture source: https://missing.csail.mit.edu/2020/debugging-profiling/
Syscalls #
Use strace to analyze which syscalls a program makes:
$ sudo strace -e lstat ls -l / > /dev/null
lstat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("swapfile", {st_mode=S_IFREG|0600, st_size=1989529600, ...}) = 0
lstat("bin", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("lib64", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("vmlinuz", {st_mode=S_IFLNK|0777, st_size=31, ...}) = 0
lstat("initrd.img", {st_mode=S_IFLNK|0777, st_size=34, ...}) = 0
lstat("home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("proc", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
lstat("var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
lstat("dev", {st_mode=S_IFDIR|0755, st_size=3800, ...}) = 0
lstat("root", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
lstat("initrd.img.old", {st_mode=S_IFLNK|0777, st_size=33, ...}) = 0
lstat("etc", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("run", {st_mode=S_IFDIR|0755, st_size=1020, ...}) = 0
lstat("usr", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("media", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("lost+found", {st_mode=S_IFDIR|0700, st_size=16384, ...}) = 0
lstat("opt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat("sbin", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
lstat("snap", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("vmlinuz.old", {st_mode=S_IFLNK|0777, st_size=30, ...}) = 0
lstat("boot", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("srv", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
+++ exited with 0 +++
Profilers #
Use line_profiler with kernprof to get the running time of every line in your Python code. You can also get function-level timings from Python’s built-in cProfile.
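A minimal sketch of how this usually looks; the script, function, and numbers are made up for illustration:
# fib.py -- toy script to profile
@profile                         # kernprof injects @profile; remove it when running plain Python / cProfile
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

if __name__ == '__main__':
    fib(25)

$ kernprof -l -v fib.py                   # per-line timings via line_profiler
$ python -m cProfile -s tottime fib.py    # function-level stats (drop the @profile line first)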
Use memory-profiler to investigate the memory usage of each line of your Python code.
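A similar sketch for memory-profiler (file name and sizes made up); here @profile is provided when the script runs under python -m memory_profiler:
# mem_example.py
@profile
def build_lists():
    xs = [i for i in range(10**6)]    # allocate a large list
    ys = [x * 2 for x in xs]          # and a second one
    del xs                            # release the first
    return ys

if __name__ == '__main__':
    build_lists()

$ python -m memory_profiler mem_example.py   # prints memory usage line by line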
Use perf for event-based profiling; it can trace a huge number of hardware and software events, so dig in and use it!
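A few starting points (my_program is a placeholder):
$ perf list                      # show the events perf can trace on this machine
$ perf stat ./my_program         # summary counters: cycles, instructions, branches, ...
$ perf record -g ./my_program    # sample stack traces while the program runs
$ perf report                    # browse the recorded samples interactively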
Use FlameGraph to visualize profiling data (from perf and similar tools).
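A typical pipeline, assuming the stackcollapse-perf.pl and flamegraph.pl scripts from the FlameGraph repository are on your PATH and my_program is again a placeholder:
$ perf record -g ./my_program
$ perf script | stackcollapse-perf.pl | flamegraph.pl > flame.svg
$ xdg-open flame.svg             # the SVG is interactive in a browser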
If you are working with Python, you can use pycallgraph to visualize the function call graph along with some profiling information. If you are only interested in the static call graph, give code2flow a try.
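For example (my_script.py is hypothetical; check each tool's --help for output options):
$ pycallgraph graphviz -- ./my_script.py   # runs the script and writes pycallgraph.png
$ code2flow my_script.py                   # builds a static call graph without running the code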
Resource Monitoring #
IMO, this is the critical part of all profiling: profiling is all about resources, and monitoring is where profiling begins.
- General monitoring: htop
- I/O: iotop
- Disk usage: df, du, ncdu
- Memory: free
- Open files (file descriptors): lsof: list of open files.
- Network connections statistics:
  - ss: use ss -s to show a socket usage summary.
  - lsof: sockets are files in Linux, so use lsof -i :<port_num> to find the sockets on a given port.
  - ip: a tool to show / manipulate routing, network devices, interfaces and tunnels.
- Network usage: nethogs, iftop
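A few of these in practice (the port number is just an example):
$ htop               # interactive CPU / memory / process overview
$ free -h            # memory usage in human-readable units
$ df -h              # disk usage per mounted filesystem
$ du -sh ./*         # disk usage of each entry in the current directory
$ ss -s              # socket usage summary
$ lsof -i :8080      # which process has a socket open on port 8080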
Specialized Tools #
Use hyperfine to benchmark CLI (command-line interface) tools.
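For example, the lecture compares fd against find along these lines:
$ hyperfine --warmup 3 'fd -e jpg' 'find . -iname "*.jpg"'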
Exercises #
Debugging #
- Use journalctl on Linux or log show on macOS to get the super user accesses and commands in the last day. If there aren’t any you can execute some harmless commands such as sudo ls and check again.
$ journalctl -S "`date -d 'last day' '+%Y-%m-%d'`" | grep "root"
- Do this hands-on pdb tutorial to familiarize yourself with the commands. For a more in-depth tutorial read this.
TODO
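For reference, the pdb commands you end up using most (script and variable names are hypothetical):
$ python -m pdb my_script.py
(Pdb) l            # list source around the current line
(Pdb) b 10         # set a breakpoint at line 10
(Pdb) c            # continue until the next breakpoint
(Pdb) n            # execute the next line, stepping over calls
(Pdb) s            # step into a function call
(Pdb) p some_var   # print the value of a variable
(Pdb) q            # quit the debugger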
- Install shellcheck and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically.
#!/bin/sh
## Example: a typical script with several problems
for f in $(ls *.m3u)
do
    grep -qi hq.*mp3 $f \
        && echo -e 'Playlist $f contains a HQ file in mp3 format'
done
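One way to fix it, roughly following shellcheck's usual complaints about this pattern: don't parse ls output, quote the grep pattern and the variable, and use double quotes so $f actually expands in the message.
#!/bin/sh
## Fixed: iterate over the glob directly, quote everything, drop the non-portable echo -e
for f in ./*.m3u
do
    if grep -qi 'hq.*mp3' "$f"; then
        echo "Playlist $f contains a HQ file in mp3 format"
    fi
done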
- (Advanced) Read about reversible debugging and get a simple example working using rr or RevPDB.
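With rr the basic loop looks like this (my_program is a placeholder):
$ rr record ./my_program   # run the program once and record the execution
$ rr replay                # replay the latest recording inside gdb
(rr) reverse-continue      # run backwards to the previous breakpoint or watchpoint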
Profiling #
- Here are some sorting algorithm implementations. Use cProfile and line_profiler to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Then use memory_profiler to check the memory consumption; why is insertion sort better? Now check the in-place version of quicksort. Challenge: Use perf to look at the cycle counts and cache hits and misses of each algorithm.
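For the perf challenge, something along these lines works (sorts.py stands in for whatever file holds the implementations):
$ perf stat -e cycles,instructions,cache-references,cache-misses python sorts.py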
- Here’s some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number.
#!/usr/bin/env python
def fib0(): return 0

def fib1(): return 1

s = """def fib{}(): return fib{}() + fib{}()"""

if __name__ == '__main__':
    for n in range(2, 10):
        exec(s.format(n, n-1, n-2))
    # from functools import lru_cache
    # for n in range(10):
    #     exec("fib{} = lru_cache(1)(fib{})".format(n, n))
    print(eval("fib9()"))
Put the code into a file and make it executable. Install prerequisites: pycallgraph and graphviz. (If you can run dot, you already have GraphViz.) Run the code as is with pycallgraph graphviz -- ./fib.py and check the pycallgraph.png file. How many times is fib0 called? We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each fibN function now?
- A common issue is that a port you want to listen on is already taken by another process. Let’s learn how to discover that process’s pid. First execute python -m http.server 4444 to start a minimal web server listening on port 4444. On a separate terminal run lsof | grep LISTEN to print all listening processes and ports. Find that process’s pid and terminate it by running kill <PID>.
- Limiting a process’s resources can be another handy tool in your toolbox. Try running stress -c 3 and visualize the CPU consumption with htop. Now, execute taskset --cpu-list 0,2 stress -c 3 and visualize it. Is stress taking three CPUs? Why not? Read man taskset. Challenge: achieve the same using cgroups. Try limiting the memory consumption of stress -m.
- (Advanced) The command curl ipinfo.io performs an HTTP request and fetches information about your public IP. Open Wireshark and try to sniff the request and reply packets that curl sent and received. (Hint: Use the http filter to just watch HTTP packets.)