Welcome to today's article on Sysdig, a Linux system exploration and troubleshooting tool with first-class support for containers. Sysdig captures system state and activity from a running Linux instance, then lets you save, filter, and analyze it. You can use it in place of many Linux troubleshooting commands such as top, lsof, strace, iostat, and ps. It combines the benefits of utilities such as strace, tcpdump, and lsof in a single application, and it ships with a set of scripts called chisels that make it easier to extract useful information and troubleshoot problems.
In this article we'll walk through installing sysdig and its basic usage for system monitoring and troubleshooting on CentOS 7 and Ubuntu 15.
1) Installing Sysdig on Ubuntu 15
Sysdig is included in recent versions of Debian, RHEL, and container-based operating systems; however, it is updated with new functionality all the time, so we will install the latest version from the repository maintained by Draios. We are going to install sysdig with the 'apt' command, but first we need to set up the Draios apt repository by running the following 'curl' commands as the root user.
The commands below trust the Draios GPG key and configure the apt repository.
# curl -s https://s3.amazonaws.com/download.draios.com/DRAIOS-GPG-KEY.public | apt-key add -
# curl -s -o /etc/apt/sources.list.d/draios.list http://download.draios.com/stable/deb/draios.list
Now update the package list by running the following command.
# apt-get update
Once the package list is updated, install the kernel headers package with the command shown below.
# apt-get -y install linux-headers-$(uname -r)
Now you can install sysdig on Ubuntu with the following command.
# apt-get -y install sysdig
2) Installing Sysdig on CentOS 7
Installation on CentOS 7 is similar to the Ubuntu procedure, except that you set up a yum repository, which uses its own key to verify the authenticity of the packages.
Run the following command, which uses the 'rpm' tool with the '--import' option to add the Draios key to your RPM keyring.
# rpm --import https://s3.amazonaws.com/download.draios.com/DRAIOS-GPG-KEY.public
After this, download the Draios repository file so that yum can use it on your CentOS 7 server.
# curl -s -o /etc/yum.repos.d/draios.repo http://download.draios.com/stable/rpm/draios.repo
Now refresh the repository metadata and update the installed packages by running the following command before installing the sysdig package.
# yum update
The EPEL repository is needed to install the Dynamic Kernel Module Support (DKMS) package used by sysdig. Run the following command to enable the EPEL repository.
# yum -y install epel-release
Now install the kernel headers, which are needed to set up the sysdig-probe module, and then run the command to install the sysdig package on the server.
# yum install kernel-devel-$(uname -r)
# yum install sysdig
3) Using Sysdig
Now that sysdig is installed, let's look at some of the most useful ways to use it for troubleshooting your system. The simplest way to use sysdig is to invoke it without any arguments, as shown below.
# sysdig
By default, sysdig prints each captured event on a single line with the following fields: event number, event time, CPU number, process name, thread ID (in parentheses), event direction ('>' for an enter event, '<' for an exit event), event type, and event arguments.
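To make the default layout concrete, here is a minimal sketch that splits one output line into labeled fields with awk. The sample line is taken from the mysqld output shown later in this article. The function name `parse_sysdig_line` is made up for illustration and is not part of sysdig, and the sketch assumes the line carries a direction marker in the sixth field.

```shell
#!/bin/sh
# parse_sysdig_line: hypothetical helper, not part of sysdig.
# Splits one line of default sysdig output into labeled fields.
# Assumes the line carries a direction marker (> or <) in field 6.
parse_sysdig_line() {
    printf '%s\n' "$1" | awk '{
        tid = $5
        gsub(/[()]/, "", tid)          # "(2899)" becomes "2899"
        args = ""
        for (i = 8; i <= NF; i++)      # everything after the type is arguments
            args = args (i > 8 ? " " : "") $i
        print "num=" $1
        print "time=" $2
        print "cpu=" $3
        print "proc=" $4
        print "tid=" tid
        print "dir=" $6
        print "type=" $7
        print "args=" args
    }'
}

# Sample line copied from the mysqld output in this article.
parse_sysdig_line '140632 02:20:30.848289674 2 mysqld (2899) > switch next=2894(mysqld) pgft_maj=0 pgft_min=1'
```

Splitting on whitespace is enough here because only the trailing arguments can contain several space-separated tokens, and the loop joins them back together.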
The output is very large and not very useful by itself, so you can write it to a file by using the '-w' flag and specifying a file name such as 'result.dump', as shown in the command below.
# sysdig -w result.dump
Then run the following command with the '-r' flag to read the events back from the saved file.
# sysdig -r result.dump
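A capture started with '-w' keeps growing until you interrupt it. One way to bound it is sysdig's '-n' option, which stops after a given number of events. The sketch below wraps that in a small function. The name `capture_events` and the `DRY_RUN` variable are assumptions made for this example, not sysdig features; with `DRY_RUN=1` the function only prints the command it would run.

```shell
#!/bin/sh
# capture_events: hypothetical wrapper, not part of sysdig.
# Captures a fixed number of events to a file, then stops.
# Set DRY_RUN=1 to print the command instead of running it.
capture_events() {
    count=$1
    outfile=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sysdig -n $count -w $outfile"
    else
        sysdig -n "$count" -w "$outfile"
    fi
}

# Print the command that would capture 10000 events to result.dump.
DRY_RUN=1 capture_events 10000 result.dump
```

Running `capture_events 10000 result.dump` without `DRY_RUN` performs the actual capture and needs root privileges, like the other sysdig commands in this article.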
Filters let you narrow sysdig's output to specific information. Run the following command to list the available filter fields.
# sysdig -l
----------------------
Field Class: fd

fd.num          the unique number identifying the file descriptor.
fd.cport        for TCP/UDP FDs, the client port.
fd.rproto       for TCP/UDP FDs, the remote protocol.

----------------------
Field Class: process

proc.pid        the id of the process generating the event.
proc.name       the name (excluding the path) of the executable generating the event.
proc.args       the arguments passed on the command line when starting the process generating the event.
proc.env        the environment variables of the process generating the event.
proc.cmdline    full process command line, i.e. proc.name + proc.args.
proc.exeline    full process command line, with exe as first argument, i.e. proc.exe + proc.args.
proc.cwd        the current working directory of the event.
proc.duration   number of nanoseconds since the process started.
proc.fdlimit    maximum number of FDs the process can open.
proc.fdusage    the ratio between open FDs and maximum available FDs for the process.
.
thread.pfminor  number of minor page faults since thread start.
thread.ismain   'true' if the thread generating the event is the main one in the process.
You can narrow the results with this powerful filtering system. The 'proc.name' field, for instance, selects all sysdig events for a specific process.
The command below reads the saved capture and keeps only the events of the 'mysqld' process.
# sysdig -r result.dump proc.name=mysqld
140630 02:20:30.848284977 2 mysqld (2899) io_getevents
140632 02:20:30.848289674 2 mysqld (2899) > switch next=2894(mysqld) pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140633 02:20:30.848292784 2 mysqld (2894) io_getevents
140635 02:20:30.848297142 2 mysqld (2894) > switch next=2901(mysqld) pgft_maj=0 pgft_min=4 vm_size=841372 vm_rss=85900 vm_swap=0
140636 02:20:30.848300414 2 mysqld (2901) io_getevents
140638 02:20:30.848307954 2 mysqld (2901) > switch next=0 pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140640 02:20:30.849340499 1 mysqld (2900) io_getevents
140642 02:20:30.849348907 1 mysqld (2900) > switch next=2895(mysqld) pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140643 02:20:30.849357633 1 mysqld (2895) io_getevents
140645 02:20:30.849362258 1 mysqld (2895) > switch next=26329(tuned) pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140702 02:20:30.995763869 1 mysqld (2898) io_getevents
140704 02:20:30.995777232 1 mysqld (2898) > switch next=2893(mysqld) pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140705 02:20:30.995782563 1 mysqld (2893) io_getevents
140707 02:20:30.995795720 1 mysqld (2893) > switch next=0 pgft_maj=0 pgft_min=3 vm_size=841372 vm_rss=85900 vm_swap=0
140840 02:20:31.204456822 1 mysqld (2933) futex addr=7F1453334D50 op=129(FUTEX_PRIVATE_FLAG|FUTEX_WAKE) val=1
140842 02:20:31.204464336 1 mysqld (2933) futex addr=7F1453334D8C op=393(FUTEX_CLOCK_REALTIME|FUTEX_PRIVATE_FLAG|FUTEX_WAIT_BITSET) val=12395
140844 02:20:31.204569972 1 mysqld (2933) > switch next=3920 pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140875 02:20:31.348405663 2 mysqld (2897) io_getevents
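The output above interleaves events from several mysqld threads. As a rough way to see which threads are busiest, the sketch below counts events per thread ID with awk. The function name `count_by_thread` is hypothetical; it relies only on the ID in parentheses being the fifth whitespace-separated field, which holds for every line shown above.

```shell
#!/bin/sh
# count_by_thread: hypothetical helper, not part of sysdig.
# Reads default-format sysdig output on stdin and prints
# "<thread id> <event count>", sorted numerically by thread id.
count_by_thread() {
    awk '{
        tid = $5
        gsub(/[()]/, "", tid)   # strip the parentheses around the id
        count[tid]++
    }
    END {
        for (t in count) print t, count[t]
    }' | sort -n
}

# Demo on the first three lines of the mysqld output in this article.
count_by_thread <<'EOF'
140630 02:20:30.848284977 2 mysqld (2899) io_getevents
140632 02:20:30.848289674 2 mysqld (2899) > switch next=2894(mysqld) pgft_maj=0 pgft_min=1 vm_size=841372 vm_rss=85900 vm_swap=0
140633 02:20:30.848292784 2 mysqld (2894) io_getevents
EOF
```

On a real capture you would pipe sysdig into the function, for example `sysdig -r result.dump proc.name=mysqld | count_by_thread`.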
To filter live events for the 'sshd' process, use the following command with the proc.name field.
# sysdig proc.name=sshd
Network and System Diagnosing with Sysdig
To see the top processes by network bandwidth usage, run the sysdig command below.
# sysdig -c topprocs_net
Bytes     Process   PID
--------------------------------------------------------------------------------
304B      sshd      3194
To capture the events of all processes that access a specific file, use the command below.
# sysdig fd.name=/var/log
To capture the events of all processes that access files under a specific directory, use the following command. Filters support comparison operators such as contains, =, !=, <, <=, >, and >=. Filters work both when reading from a file and on the live event stream.
# sysdig fd.name contains /etc
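Filter expressions can also be combined with the boolean operators 'and', 'or', and 'not', using parentheses for grouping. The sketch below builds a filter that matches any of several process names and then restricts it to files under /etc. The helper name `any_proc_filter` is an assumption made for this example and is not part of sysdig.

```shell
#!/bin/sh
# any_proc_filter: hypothetical helper, not part of sysdig.
# Joins process names into "proc.name=a or proc.name=b ...".
any_proc_filter() {
    filter=""
    for name in "$@"; do
        if [ -n "$filter" ]; then
            filter="$filter or "
        fi
        filter="${filter}proc.name=$name"
    done
    printf '%s\n' "$filter"
}

# Build a filter for two processes, restricted to files under /etc.
echo "($(any_proc_filter sshd mysqld)) and fd.name contains /etc"
```

Pass the resulting string to sysdig as a single quoted argument, for example `sysdig "(proc.name=sshd or proc.name=mysqld) and fd.name contains /etc"`, so that the shell does not interpret the parentheses.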
Using Chisels in Sysdig
Sysdig’s chisels are little scripts that analyze the sysdig event stream to perform useful actions. If you’ve used system tracing tools like dtrace, you’re probably familiar with running scripts that trace OS events. Chisels work well on live systems, but can also be used with trace files for offline analysis.
To list the available chisels with a short description of each, run the following command.
# sysdig -cl
To run a chisel, use the '-c' flag. For instance, let's run the topfiles_bytes chisel as shown below.
# sysdig -c topfiles_bytes
If you want to see the top files in a specific folder, use the command below.
# sysdig -c topfiles_bytes "fd.name contains /root"
To see the top files accessed by a specific user, use the command below.
# sysdig -c topfiles_bytes "user.name=admin"
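Because chisels also work on trace files, the same analysis can be run offline against a saved capture by combining '-r' with '-c'. The wrapper below is a minimal sketch of that idea. The function name `run_chisel_offline` is hypothetical, and it assumes a chisel such as topfiles_bytes that takes no arguments of its own, so that the trailing string is treated as a filter.

```shell
#!/bin/sh
# run_chisel_offline: hypothetical wrapper, not part of sysdig.
# Runs a chisel against a saved capture file instead of the live
# event stream. An optional third argument is passed as the filter.
run_chisel_offline() {
    capture=$1
    chisel=$2
    filter=${3:-}
    if [ -n "$filter" ]; then
        sysdig -r "$capture" -c "$chisel" "$filter"
    else
        sysdig -r "$capture" -c "$chisel"
    fi
}

# Example (requires sysdig and an existing capture file):
#   run_chisel_offline result.dump topfiles_bytes "fd.name contains /root"
```

Keeping the filter as one quoted argument mirrors the chisel commands above and avoids word splitting in the shell.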
Thank you for reading this article. I hope you have found sysdig helpful and that it becomes your favorite system and network diagnostic tool. There are many more features to explore in sysdig. Don't forget to share your findings with us and leave your comments.