perf is a performance analysis tool for Linux. It can reveal a great deal about a running Linux system: why L2 cache misses are happening, why your disk is showing activity, which code paths are causing stalls in the kernel, and much more. In this article we cover the basics of perf and show how to use it to collect performance data about your system.
How to Install perf
Installing the perf userspace tools from the terminal (along with awk) is straightforward.
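On Ubuntu, type the following (if perf then complains that it cannot find a build for your running kernel, also install the matching linux-tools-$(uname -r) package):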
sudo apt install linux-tools-common gawk
On CentOS and Fedora:
sudo yum install perf gawk
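Before moving on, it can help to confirm that both tools really are on your PATH. The following is only a minimal sketch: have_tool is a hypothetical helper name, not something that ships with perf.

```shell
#!/bin/sh
# have_tool is a hypothetical helper (not part of perf): it succeeds
# when the named command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of the two tools this article relies on.
for tool in perf awk; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found, install it with your package manager"
    fi
done
```

If perf is reported as missing, go back to the install command for your distribution.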
A few uses of the perf tool
Let's record disk I/O using the command below.
sudo perf record -e block:block_rq_issue -ag
It will keep recording data until you press Ctrl-C; stop it after a couple of seconds. Then issue the following command:
sudo perf report
And you will get something like this:
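The plus sign means that the entry is expandable, so you can see which code paths are responsible. I expanded one line in the second image, where we see more information.
Next, let's check cache misses. This command records CPU level 1 data cache misses (the -c 10000 option takes one sample per 10,000 events, and sleep 5 limits the recording to five seconds):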
sudo perf record -e L1-dcache-load-misses -c 10000 -ag -- sleep 5
When we check the report with sudo perf report -f, we will see which code is responsible for the misses:
We see that a function in Skype is responsible for 0.83% of the misses in the L1 data cache.
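On a busy system the report gets long, and most entries are tiny. Since we installed awk anyway, here is a hedged sketch of how a text report (perf report has a --stdio option for plain-text output) could be trimmed to entries above a chosen overhead. filter_overhead and sample_report are hypothetical names, and the sample lines are made up to resemble a report; only the 0.83% Skype entry echoes the result above.

```shell
#!/bin/sh
# filter_overhead is a hypothetical helper: it reads report-like text on
# standard input and keeps only lines whose first field is a percentage
# greater than or equal to the threshold given as $1.
filter_overhead() {
    awk -v min="$1" '
        $1 ~ /^[0-9.]+%$/ {
            pct = $1
            sub(/%$/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= min + 0) print
        }'
}

# Made-up sample lines in the general shape of a text report.
sample_report() {
    cat <<'EOF'
# Overhead  Command  Shared Object  Symbol
    12.50%  gcc      libc.so        [.] memcpy
     0.83%  skype    skype          [.] 0x0000000000abcdef
     0.10%  bash     bash           [.] main
EOF
}

# Keep only entries with at least 0.5% overhead.
sample_report | filter_overhead 0.5
```

With real data, the input would come from something like sudo perf report -f --stdio piped into filter_overhead. Note that the sketch only looks at the first percentage column on each line.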
So far we have recorded data and then analyzed the recording, but what if you want to record and see the output immediately, in real time? Read on; the next section covers that.
perf is a very powerful tool, but it is not the best documented one, due to frequent changes to the underlying framework. So for easier performance counting in real time, we will use a suite of scripts from GitHub. Let's clone them:
cd
git clone --depth 1 https://github.com/brendangregg/perf-tools
cd perf-tools
And you are all set. Now let's run some scripts:
This script gives us the latency of the disk as a histogram. I ran it for just a second; you can let it run longer. I did not, because I have a 5400 rpm HDD and the results just can't be good.
Let's try some more scripts.
This script uses perf to get cache misses every second and prints them one line at a time. Notice that for the first 3 seconds there were no misses, and later they started. That is because I started a rebuild of my project in Android Studio just at that time. I can say that Intel made a pretty good branch predictor in Haswell. The cache hit percentage rarely drops below 95% on my i7 with 6 MB of L3 cache.
sudo ./kernel/funccount -t 5 -d 5 'ext4*'
Tracing "ext4*" for 5 seconds. Top 5 only...
FUNC                              COUNT
ext4_mark_iloc_dirty                109
ext4_reserve_inode_write            109
ext4_get_group_desc                 112
ext4_inode_table                    112
ext4_journal_check_start            155
This script traces the kernel functions matching the pattern you give (ext4* in this example) for the duration you set (-d 5 means five seconds here) and prints a top list with as many entries as you ask for (-t 5 here).
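If you save output like this, a little awk gives you a quick total. The sketch below is just one way to do it, with sum_counts and sample_funccount as hypothetical helper names; it reuses the numbers from the ext4 run shown above as sample input.

```shell
#!/bin/sh
# sum_counts is a hypothetical helper: it adds up the COUNT column of
# funccount-style output read from standard input. Only rows with exactly
# two fields and a numeric second field are counted, so the "Tracing..."
# banner and the FUNC/COUNT header are skipped.
sum_counts() {
    awk 'NF == 2 && $2 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }'
}

# The output of the ext4 run from this article, pasted in as sample input.
sample_funccount() {
    cat <<'EOF'
Tracing "ext4*" for 5 seconds. Top 5 only...
FUNC                              COUNT
ext4_mark_iloc_dirty                109
ext4_reserve_inode_write            109
ext4_get_group_desc                 112
ext4_inode_table                    112
ext4_journal_check_start            155
EOF
}

sample_funccount | sum_counts   # prints 597
```

Keep in mind that with -t 5 the total covers only the five listed functions, not every ext4 function that was called.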
Some non-realtime scripts
This script uses perf_events to count syscalls:
sudo ./syscount -c
Tracing... Ctrl-C to end.
^Csleep: Interrupt
SYSCALL              COUNT
exit                     1
lseek                    1
newuname                 1
dup                      2
sched_getaffinity        2
tgkill                   2
timerfd_settime          2
unlink                   2
access                   3
prctl                    3
set_robust_list          4
fdatasync                5
getsockopt               5
epoll_ctl                6
newlstat                 6
ftruncate                7
munmap                   8
shutdown                 8
inotify_add_watch        9
bind                    10
mmap                    14
Unlike the script above, it is not real time: you have to press Ctrl-C to stop counting, and only then do you get the output.
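The list is printed in ascending order, so the busiest syscalls end up at the bottom. If you have saved the output to a file, a small filter can pull out the top few. This is a sketch under assumptions: top_syscalls and sample_syscount are hypothetical names, and the sample is a shortened copy of the output above.

```shell
#!/bin/sh
# top_syscalls is a hypothetical helper: it reads syscount-style output on
# standard input and prints the $1 busiest syscalls, highest count first.
# Rows that are not "name number" pairs (banner, header) are dropped.
top_syscalls() {
    awk 'NF == 2 && $2 ~ /^[0-9]+$/' | sort -k2,2nr | head -n "$1"
}

# A shortened copy of the syscount output from this article.
sample_syscount() {
    cat <<'EOF'
Tracing... Ctrl-C to end.
^Csleep: Interrupt
SYSCALL              COUNT
exit                     1
fdatasync                5
munmap                   8
inotify_add_watch        9
bind                    10
mmap                    14
EOF
}

# Show the three most frequent syscalls.
sample_syscount | top_syscalls 3
```

For real data you would redirect the script's output to a file (say, a hypothetical syscount.txt) and run top_syscalls 3 < syscount.txt.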
If you want to track the calls of a specific process, you first need to get its PID with the command
sudo ./syscount -v
and then use that PID like this
sudo ./syscount -cp 5656
to see which syscalls the process with PID 5656 calls the most.
We have gone through basic examples of what can be done with perf to gather performance data about your system. But we have only scratched the surface: perf is a really extensive tool that can give you many more details. Unfortunately, we can only cover so much in one article. Thank you for reading and good day.