Install atop, a Linux Tool to Monitor Your System Processes

August 4, 2014

Atop is an interactive, full-screen ASCII performance monitor for viewing the load on a Linux system. It shows the occupation of critical hardware resources such as CPU, memory and disk, and it shows which processes are responsible for the indicated load in terms of CPU and memory usage at the process level.

This article will guide you through the various uses of atop.

Installing atop

You can download the latest version of atop from its website, which offers an RPM package, so it is easy to install. Use wget to download the package and yum to install it:

wget http://www.atoptool.nl/download/atop-2.0.2-1.x86_64.rpm
yum install atop-2.0.2-1.x86_64.rpm
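After installing, it can be worth confirming that the binary is actually on your PATH. The following is a minimal sketch; the variable name atop_installed is only an illustration, and it assumes your atop build supports the -V (version) flag:

```shell
# Check whether atop is reachable on PATH and, if so, print its version.
# atop_installed is a hypothetical helper variable used only in this sketch.
if command -v atop >/dev/null 2>&1; then
    atop -V
    atop_installed=yes
else
    echo "atop not found; install the package first" >&2
    atop_installed=no
fi
```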

atop main window

You can access atop by simply running the command like this:

# atop

And it will display the interactive screen like this:

[Screenshot: atop main window]

IMPORTANT NOTE: When atop starts, it switches on the process accounting mechanism in the kernel, which forces the kernel to write a record with accounting information to the accounting file whenever a process ends. This means you must ALWAYS exit atop by pressing 'q' or with kill -15. If you stop it with kill -9, or in any other way that does not let it switch off the accounting mechanism, the kernel keeps writing records and the accounting file can grow very large on disk.
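For an atop instance that runs in the background, the clean shutdown described above could be scripted roughly like this. The function name stop_cleanly is made up for this sketch, and it assumes the process was started from the same shell so that wait can collect it:

```shell
# Stop a background process with SIGTERM (kill -15) and wait until it is gone.
# SIGTERM can be handled by the process, so atop gets the chance to switch
# off process accounting; SIGKILL (kill -9) would not give it that chance.
stop_cleanly() {
    pid=$1
    kill -15 "$pid" 2>/dev/null || return 1   # process not found or not ours
    wait "$pid" 2>/dev/null || true           # reap it; ignore its exit status
    return 0
}

# Usage (commented out; needs a PID of a process started by this shell):
#   stop_cleanly "$atop_pid"
```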

System level information

The first part of the window shows system-level activity. Each system resource is shown on its own line, and a line is displayed only if the resource had any activity during the last interval (10 seconds by default).
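The interval does not have to stay at its default: atop accepts an interval in seconds and, optionally, a number of samples as trailing arguments. Here is a small sketch of a wrapper around that; the function name atop_sampled and its return codes are assumptions made for this example:

```shell
# Run atop with a custom interval (seconds) and an optional sample count.
# Returns 2 on invalid arguments and 127 when atop is not installed.
atop_sampled() {
    interval=$1
    samples=${2:-}
    case "$interval" in
        ''|*[!0-9]*) echo "interval must be a whole number" >&2; return 2 ;;
    esac
    case "$samples" in
        *[!0-9]*) echo "samples must be a whole number" >&2; return 2 ;;
    esac
    command -v atop >/dev/null 2>&1 || { echo "atop not installed" >&2; return 127; }
    # $samples is left unquoted on purpose so an empty value adds no argument.
    atop "$interval" $samples
}

# Usage (commented out; opens the interactive screen):
#   atop_sampled 5 3    # refresh every 5 seconds, exit after 3 samples
```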

[Screenshot: atop system level info]

The lines have the following meaning:

PRC (process) - shows the total CPU time consumed in system (‘sys’) and user (‘user’) mode, the total number of processes present (‘#proc’), the total number of threads in the running (‘#trun’), sleeping interruptible (‘#tslpi’) and sleeping uninterruptible (‘#tslpu’) states, the number of zombie processes (‘#zombie’), the number of clone system calls (‘clones’) and the number of processes that ended during the interval (‘#exit’).

CPU - contains the percentage of CPU time spent in kernel mode by all active processes (‘sys’), the percentage of CPU time consumed in user mode (‘user’), the percentage of CPU time spent on interrupt handling (‘irq’), the percentage of unused CPU time while no processes were waiting for disk I/O (‘idle’), the percentage of unused CPU time while at least one process was waiting for disk I/O (‘wait’), the current frequency (‘curf’) and the current scaling percentage (‘curscal’).

CPL (CPU load) - shows the load average figures, which reflect the number of threads that are available to run on a CPU, averaged over 1 (‘avg1’), 5 (‘avg5’) and 15 (‘avg15’) minutes. It also shows the number of context switches (‘csw’), the number of serviced interrupts (‘intr’) and the number of available CPUs.
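If you want to cross-check the CPL figures outside atop, the kernel exposes the same load averages in /proc/loadavg. The snippet below is only a rough sketch: the variable names are borrowed from the atop labels, and using getconf for the CPU count is an assumption about what "available CPUs" corresponds to:

```shell
# Read the 1, 5 and 15 minute load averages straight from the kernel.
if [ -r /proc/loadavg ]; then
    read -r avg1 avg5 avg15 rest < /proc/loadavg
    echo "avg1=$avg1 avg5=$avg5 avg15=$avg15"
fi

# Number of processors currently online.
ncpu=$(getconf _NPROCESSORS_ONLN)
echo "cpus=$ncpu"
```

A load average that stays well above the CPU count usually means threads are queuing for CPU time.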

MEM (Memory) - contains the total amount of memory (‘tot’), the amount of memory which is currently free (‘free’), the amount of memory in use as cache (‘cache’), the amount of memory within the cache that has to be flushed to disk (‘dirty’), the amount of memory used for filesystem metadata (‘buff’), the amount of memory being used for kernel mallocs (‘slab’), the resident size of shared memory including tmpfs (‘shmem’) and the resident size of shared memory (‘shrss’).
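Most of the MEM counters have a close relative in /proc/meminfo, which makes a quick sanity check possible. The mapping in this sketch is an assumption based on the field descriptions above, not something taken from the atop source, and the helper name meminfo_field is hypothetical:

```shell
# Print one /proc/meminfo value (in kB) by field name, e.g. MemTotal.
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

# Assumed atop-label=meminfo-field pairs; treat the mapping as approximate.
if [ -r /proc/meminfo ]; then
    for pair in tot=MemTotal free=MemFree cache=Cached dirty=Dirty \
                buff=Buffers slab=Slab shmem=Shmem; do
        label=${pair%%=*}
        key=${pair#*=}
        printf '%-6s %s kB\n' "$label" "$(meminfo_field "$key")"
    done
fi
```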

SWP (Swap occupation) - contains the total amount of swap space on disk (‘tot’), the amount of free swap space (‘free’), the committed virtual memory space (‘vmcom’) and the maximum limit of the committed space (‘vmlim’).

LVM/MDD/DSK (Logical volume/multiple device/disk utilization) - each device produces one line, which shows the portion of time that the unit was busy handling requests (‘busy’), the number of read requests issued (‘read’), the number of write requests issued (‘write’), the read throughput in MiBytes per second (‘MBr/s’), the write throughput in MiBytes per second (‘MBw/s’), the average queue depth (‘avq’) and the average number of milliseconds a request needed (‘avio’) for seek, latency and data transfer.

NET (Network utilization) - shows one line for the transport layer (TCP/UDP), one line for the IP layer and one line per active interface.

For the transport layer it shows the number of received TCP segments (‘tcpi’), number of transmitted TCP segments (‘tcpo’), number of UDP datagrams received (‘udpi’), number of UDP datagrams transmitted (‘udpo’), number of active TCP opens (‘tcpao’), number of passive TCP opens (‘tcppo’), number of TCP output retransmissions (‘tcprs’) and number of TCP input errors (‘tcpie’).

For the IP layer it shows the number of IP datagrams received from interfaces (‘ipi’), number of IP datagrams that local higher-layer protocols offered for transmission (‘ipo’), number of received IP datagrams which were forwarded to other interfaces (‘ipfrw’), number of IP datagrams which were delivered to local higher-layer protocols (‘deliv’), number of received ICMP datagrams (‘icmpi’), and the number of transmitted ICMP datagrams (‘icmpo’).

For every active network interface it shows the name of the interface and its busy percentage in the first column (‘busy’), the number of received (‘pcki’) and transmitted (‘pcko’) packets, the effective number of bits received (‘si’) and transmitted (‘so’) per second, the number of collisions (‘coll’), the number of received multicast packets (‘mlti’), the number of errors while receiving a packet (‘erri’), the number of errors while transmitting a packet (‘erro’), and the number of received (‘drpi’) and transmitted (‘drpo’) packets that were dropped.

If the screen width does not leave room for all of these counters, only a relevant subset is shown.

Process level information

[Screenshot: atop process level]

The second part of the window shows the processes whose resource utilization changed during the last interval. These processes might have used CPU time or issued disk or network requests.

You can use a few interactive keys to change the process display when you want more detailed information:
g - generic output (default)
m - memory related information
d - disk related information
n - network related information
v - various process characteristics
c - command line of the process
u - process activity accumulated per user
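According to the atop man page, the same views can also be selected at start-up with command-line flags named after these keys (for example atop -m for the memory view). The lookup helper below is a sketch for scripts that want to pick a view by name; the function view_flag and the view names are invented for this example:

```shell
# Map a human-readable view name to the matching atop start-up flag.
# printf is used instead of echo because some shells treat "echo -n"
# as an option rather than as text to print.
view_flag() {
    case "$1" in
        generic) printf '%s\n' "-g" ;;
        memory)  printf '%s\n' "-m" ;;
        disk)    printf '%s\n' "-d" ;;
        network) printf '%s\n' "-n" ;;
        various) printf '%s\n' "-v" ;;
        command) printf '%s\n' "-c" ;;
        user)    printf '%s\n' "-u" ;;
        *)       echo "unknown view: $1" >&2; return 1 ;;
    esac
}

# Usage (commented out; opens the interactive screen):
#   atop "$(view_flag memory)"
```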

Colors

atop uses colors to indicate that a critical occupation percentage has been reached. A critical occupation percentage means that the load is likely to have a noticeable negative effect on the performance of applications that use this resource.

[Screenshot: atop colors]

Output

You can also send the information provided by atop to standard output or to a file, which is useful if you want to use it in a script or look at specific information directly. To do this, use the -P flag followed by one of the labels, like this:

[Screenshot: atop output]
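As a rough illustration of using -P in a script, the sketch below asks for the CPL label and keeps only its data lines. It assumes that each data line starts with the label and that atop separates samples with SEP and RESET marker lines, as described in the man page; the helper name filter_label is hypothetical:

```shell
# Keep only the data lines for one label from "atop -P" output on stdin.
# Marker lines such as RESET and SEP never match the label, so they drop out.
filter_label() {
    awk -v label="$1" '$1 == label'
}

# Take two samples one second apart and print the CPL lines, if atop exists.
if command -v atop >/dev/null 2>&1; then
    atop -P CPL 1 2 | filter_label CPL
fi
```

Because the output is plain whitespace-separated text, it can be piped into awk, cut or a log file without further conversion.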

This covers the most common uses of atop. As you can see, it is a very versatile application that can help you diagnose system performance at any time.
