Wednesday, October 27, 2010

Understanding Linux system performance management using top

If there's something wrong with the performance of your Linux server, chances are that you're already using top to find out what's happening. It seems, however, that few people really know how to tell what their system is doing from the information that top provides. Here I will explain how to read that performance data.

You can run top as any user, but start it as root if you want to kill or renice processes that belong to other users. Open a console on your favorite Linux distribution and enter the top command. The result should look similar to this:

host:~ # top
top - 12:41:34 up 1 day,  3:29,  6 users,  load average: 0.00, 0.00, 0.00
Tasks:  99 total,   1 running,  98 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1%us,  0.1%sy,  0.0%ni, 99.7%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    775064k total,   560056k used,   215008k free,   136216k buffers
Swap:   136544k total,        0k used,   136544k free,   275624k cached
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8830 root      15   0  2232  936  692 R    2  0.1   0:00.03 top
    1 root      16   0   728  284  244 S    0  0.0   0:01.25 init
    2 root      RT   0     0    0    0 S    0  0.0   0:00.10 migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.34 ksoftirqd/0
    4 root      RT   0     0    0    0 S    0  0.0   0:00.04 migration/1
    5 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
    6 root      10  -5     0    0    0 S    0  0.0   0:00.33 events/0
    7 root      10  -5     0    0    0 S    0  0.0   0:00.10 events/1
    8 root      11  -5     0    0    0 S    0  0.0   0:00.01 khelper
    9 root      12  -5     0    0    0 S    0  0.0   0:00.00 kthread
   13 root      10  -5     0    0    0 S    0  0.0   0:00.13 kblockd/0
   14 root      10  -5     0    0    0 S    0  0.0   0:00.08 kblockd/1
   15 root      13  -5     0    0    0 S    0  0.0   0:00.00 kacpid
   16 root      13  -5     0    0    0 S    0  0.0   0:00.00 kacpi_notify
  227 root      20   0     0    0    0 S    0  0.0   0:00.00 pdflush
  228 root      15   0     0    0    0 S    0  0.0   0:00.93 pdflush
  229 root      17   0     0    0    0 S    0  0.0   0:00.00 kswapd0


The first piece of relevant information that top provides can be found in the first line: the load average. It describes how busy your computer is at the moment. The load average of your server is always given as three numbers, which represent the average over the last minute, the last five minutes and the last fifteen minutes. You should always start by interpreting these numbers, as they tell you whether your system is overloaded or not.

To understand the load average values, you must relate them to the number of CPUs or CPU cores in your computer. If you're not sure how many there are, press the 1 key while the top interface is active; this gives you a line for each CPU core that is present in your computer. The load average is not divided by the number of cores: when one core has been completely busy for the last minute, top shows 1.00, whether the system has one core or eight. On an eight-core system where one core is busy and the others are doing nothing, that 1.00 means the machine is running at only one eighth (0.125) of its capacity. In order to interpret the load average, you therefore need to know the value at which your server is fully loaded, which is roughly the number of cores. For instance, on a four-core machine, that would be 4.00. A value that stays above this number is bad, as it indicates that queuing occurs and processes are waiting for their slice of CPU time (on Linux, processes that are blocked waiting for disk I/O also count toward the load). Anything below this value means there is still headroom. If your system goes beyond that value, the next step is to determine what exactly is happening. The following listing gives an example of a one-core system that is too busy:

top - 12:49:38 up 1 day,  3:37,  6 users,  load average: 1.37, 0.34, 0.11
Tasks: 101 total,   4 running,  97 sleeping,   0 stopped,   0 zombie
Cpu(s):  7.1%us, 16.7%sy,  0.0%ni,  6.2%id, 67.3%wa,  0.4%hi,  2.4%si,  0.0%st
Mem:    775064k total,   767664k used,     7400k free,   514236k buffers
Swap:   136544k total,        0k used,   136544k free,   102744k cached
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8859 root      18   0  1788  524  448 R   46  0.1   0:21.25 cat
 8860 root      18   0  1792  524  448 R   41  0.1   0:18.83 cat
  229 root      15   0     0    0    0 S    9  0.0   0:01.54 kswapd0
 3695 root      16   0  1864  700  616 S    3  0.1   0:16.00 hald-addon-stor
 4580 root      16   0 95916  14m  11m S    3  1.9   0:36.44 main-menu
 4552 root      15   0  101m  24m  17m S    2  3.2   0:13.44 nautilus
   13 root      10  -5     0    0    0 S    1  0.0   0:00.70 kblockd/0
 4270 root      15   0  146m  12m 5716 S    1  1.7   1:01.66 X
 4536 root      15   0 29012 9688 7896 S    1  1.2   0:05.47 gnome-settings-
 4578 root      16   0 18136 5500 4124 S    1  0.7   0:14.87 gnome-power-man
 8861 root      16   0  2236 1036  780 R    1  0.1   0:00.64 top
   14 root      10  -5     0    0    0 S    1  0.0   0:00.49 kblockd/1
 4575 root      15   0 95736  13m  10m S    1  1.8   0:07.39 application-bro
 3131 root      16   0  4692 3244 1444 S    1  0.4   0:34.57 hald
 4586 root      15   0 93868  11m 9796 S    1  1.6   0:06.62 mixer_applet2
  228 root      15   0     0    0    0 S    0  0.0   0:00.95 pdflush
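
To relate a load average such as the 1.37 in the listing above to the number of cores, a small shell function can do the division. This is only a minimal sketch and not something top provides: the function names load_per_core and current_load_per_core are made up for this example, and the second one assumes that the Linux /proc/loadavg file and the nproc command are available.

```shell
#!/bin/sh
# Hypothetical helper: divide a load average by the number of CPU cores.
# Usage: load_per_core LOAD CORES
# A result above 1.00 means there is more work than the cores can handle.
load_per_core() {
    awk -v avg="$1" -v n="$2" 'BEGIN { printf "%.2f\n", avg / n }'
}

# Hypothetical helper: per-core value for this machine's 1-minute load.
# /proc/loadavg starts with the 1-, 5- and 15-minute load averages.
current_load_per_core() {
    read -r one_min _rest < /proc/loadavg
    load_per_core "$one_min" "$(nproc)"
}
```

For the one-core system in the listing, load_per_core 1.37 1 prints 1.37, so there was more work than the CPU could handle. The same load on a four-core machine (load_per_core 1.37 4) prints 0.34, which leaves plenty of headroom.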


If the workload is getting too high, you need to find out what is happening. To do this, look at the CPU line(s). You will see no fewer than eight different parameters, and of these, only three really matter. First is the "us" parameter. This indicates the percentage of time your system spends handling requests that were made in user space. If a task is not running in user space, it is running as privileged code in kernel space (also called system space), which you can see reflected in the "sy" parameter. In kernel space, code can communicate directly with the drivers. Therefore, you should worry more if your system shows a high load in system space. The third important parameter in the CPU line is "wa". This stands for waiting, and indicates the percentage of time the CPU sits idle while it waits for I/O devices. A high value here indicates a problem on the I/O channel; normally this is a hard disk that is too slow, or a misconfigured network when storage is accessed over it.
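
If you want to watch one of these three values from a script, you can pick it out of the CPU line with awk. The following is a sketch under assumptions: cpu_field is a made-up name, and the pattern only matches the "7.1%us," layout shown in the listings of this article.

```shell
#!/bin/sh
# Hypothetical helper: print one value from a "Cpu(s):" summary line.
# Usage: cpu_field KEY < line   (KEY is us, sy, ni, id, wa, hi, si or st)
# Assumes the "67.3%wa," layout used in the listings of this article.
cpu_field() {
    awk -v key="$1" '{
        for (i = 1; i <= NF; i++) {
            # Match a field such as "67.3%wa," and cut it off at the "%".
            if ($i ~ ("%" key ",?$")) {
                sub(/%.*/, "", $i)
                print $i
            }
        }
    }'
}
```

Fed with the CPU line of the busy system, cpu_field wa prints 67.3. On a live system you could try top -b -n 1 | grep 'Cpu(s)' | cpu_field wa, where -b runs top in batch mode and -n 1 limits it to one iteration. Newer versions of top print this line in a slightly different layout (for example "67.3 wa,"), so the pattern would need to be adjusted there.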

The second listing shows a system that spends far too much time waiting for I/O (67.3%wa). This is all too common: many system performance problems are related to slow I/O devices. One solution is to install a faster hard drive, but before doing that, it is a good idea to check the BIOS of your server or its disk controller and see if there are parameters that you can tune. One of the most important candidates is the write cache parameter. By writing data to the write cache before writing it to the disk platters themselves, you can dramatically reduce waiting times. Since the write cache is memory, which is far faster than the disk platters (on the order of 1,000 times), chances are that you can gain a lot by enabling this feature.

Use top to reveal memory efficiency on Linux servers 


Apart from the information on how busy your system's CPU is, top also shows you how memory-efficient your server is. You can find information about this in the lines that start with Mem: and Swap:. Let's start with swap. This is RAM that is emulated on the hard drive, and ideally your server should never have to use it. There are some exceptions though, for instance if your server runs Oracle, SAP or another application that is built to use swap. Normally, Linux starts swapping in earnest only when it runs short of physical memory. As an exception, your server may move some data to swap ahead of time so that memory can be freed faster when it's needed. But in most cases, you should add RAM to a server that starts swapping.
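
The Swap: line in top is derived from the same counters that the kernel exposes in /proc/meminfo, so you can also check swap usage from a script. The sketch below assumes the SwapTotal and SwapFree fields of /proc/meminfo; the function name swap_used_kb is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the amount of swap in use, in kB.
# Reads text in /proc/meminfo format on standard input and
# computes SwapTotal - SwapFree.
swap_used_kb() {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END { print total - free }'
}
```

Run it as swap_used_kb < /proc/meminfo. For the server in the listings above (136544k total, 136544k free) the result is 0, which is what you want to see.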

After you have verified that your system isn't swapping, you should find out what it is doing with available memory. To understand memory, you should know that Linux uses memory quite efficiently. If there's no real need for memory to service processes, it will be used as read cache or write buffers. The read cache contains files that were recently read from your computer's hard drive. The kernel just keeps them in RAM, because you might need them again, and if you do, it's a lot faster to serve these files from RAM than from the hard disk. The write buffers, on the other hand, are used as a waiting room for your server's hard drive. Instead of sending data directly to the hard drive, the operating system places it in the write buffers, where it can wait until the hard disk has time to flush these buffers (i.e., write them to disk). This also gives you a performance benefit.

The nice thing about read cache and write buffers is that the operating system can make them available almost instantaneously when it needs memory. Therefore, you should add the read cache and write buffers to the total amount of available memory. A nice way of doing this is by using the free -m command. On the -/+ buffers/cache line, you can see how much free memory your computer really has.

host:~ # free -m
             total       used       free     shared    buffers     cached
Mem:           756        746         10          0        595         40
-/+ buffers/cache:        110        646
Swap:          133          0        133


As you can see in this listing, at first sight it looks as if this server has almost no available memory left (10 MB free), but once you know that buffers and cache can be released when needed, you can see that it has plenty of available memory (646 MB).
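
The figure on the -/+ buffers/cache line is simply free memory plus buffers plus cache. As a sketch of that arithmetic (the function name reclaimable_total is made up for this example):

```shell
#!/bin/sh
# Hypothetical helper: memory that is free or can be reclaimed.
# Usage: reclaimable_total FREE BUFFERS CACHED   (all in the same unit)
reclaimable_total() {
    echo $(( $1 + $2 + $3 ))
}
```

With the values from the free -m listing, reclaimable_total 10 595 40 prints 645. The listing shows 646; the difference of 1 MB is presumably a rounding effect, because each column is rounded to whole megabytes separately. With the kilobyte values from the first top listing, reclaimable_total 215008 136216 275624 prints 626848.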

Determining active processes on a Linux server

The last interesting part of top is where it shows the most active processes on your server. The most active one is not hard to determine: by default, it is listed first in the process list. If this process uses too many system resources, top offers some options to handle it. You can terminate it by pressing the k key from the top interface. Top will then ask you for the PID and for the signal you want to send to this process. You should always try signal 15 (SIGTERM) first; this is the nice way of asking the process to please stop its activity. If that doesn't work, use signal 9 (SIGKILL), which terminates the process without further waiting and without giving it a chance to clean up.
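
The same "signal 15 first, signal 9 only if needed" approach also works outside of top with the kill command. The function below is a minimal sketch of it; the name stop_process and the default grace period of five seconds are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: ask a process to stop, then force it if needed.
# Usage: stop_process PID [GRACE_SECONDS]
stop_process() {
    pid=$1
    grace=${2:-5}

    # Signal 15 (SIGTERM): the polite request to shut down.
    kill -15 "$pid" 2>/dev/null || return 1

    # Check once per second whether the process is gone (kill -0 only tests).
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0

    # Still there after the grace period: signal 9 (SIGKILL) cannot be ignored.
    kill -9 "$pid" 2>/dev/null
    return 0
}
```

For example, stop_process 8859 10 would give the first cat process from the listing ten seconds to exit before it is killed.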

Another way of taming a process is by "renicing" it, i.e., adjusting the priority the process is running with. To do this, press the r key from the top interface. By giving a process a negative nice value, you increase its priority with regard to other processes (only root can do this). By assigning a positive nice value, you give more room to the other processes. The values you can use are between -20 and 19. It's a good idea not to assign the value of -20. By doing this, you would give the highest possible priority to a process, thus allowing it to leave hardly any time for the other processes (if it is a busy process).
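
Outside of top, the renice command does the same job. The wrapper below is a sketch that follows the advice above by refusing -20 and anything else outside the range -19 to 19; the name safe_renice is made up for this example, and it assumes a renice that accepts the "renice VALUE -p PID" form.

```shell
#!/bin/sh
# Hypothetical helper: renice a process, refusing values outside -19..19.
# Usage: safe_renice PID NICE_VALUE
safe_renice() {
    # Reject -20 and anything out of range, following the advice above.
    if [ "$2" -lt -19 ] || [ "$2" -gt 19 ]; then
        echo "nice value must be between -19 and 19" >&2
        return 1
    fi
    # Negative values normally require root privileges.
    renice "$2" -p "$1" >/dev/null
}
```

For example, safe_renice 8859 10 would lower the priority of the first cat process from the listing, while safe_renice 8859 -20 is refused.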
