Tuesday, October 26, 2010

Keep an eye on Linux processes with ps

On Microsoft Windows, the Task Manager is responsible for reporting on and managing the running processes and applications on a server. Linux servers have graphical equivalents when they are viewed from desktops such as GNOME and KDE, but the main tools for process management are command-line ones. Here we'll discuss how to see what's running on a Linux server.

On Linux, processes have owners. Any user can view all the current processes, but an ordinary user can manage only the processes they own; only the root user has full control over every process. Every process runs in its own memory space and can't accidentally overwrite another process's memory. A process that crashes or dies due to a bug can't directly corrupt any other process either, although a process may still cope poorly with the loss of another process on which it depends. The Linux kernel itself is not a process; it is special. It is the program that creates the playpen inside which all processes must play.

The central tool used to display processes is the command-line tool "ps". Type "man ps" for full details. "ps" stands for "process status". When run, it queries the Linux kernel for the current state of the machine and, based on the snapshot returned, outputs a formatted report. The plainest use of ps, with no options at all, shows the processes owned by the current Linux user that are attached to the current terminal.

In the normal output, CMD is the name of the program that the process is a running instance of, TIME is the CPU time (hours, minutes and seconds) that the process has consumed so far, and PID is the unique identifier for that process.
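As a rough illustration of those columns, here is a small sketch that runs plain ps and then asks for just the PID, TIME and command name of the current shell. The variable name own_row is only for illustration, and the -p and -o options are assumed to behave as in the common procps version of ps.

```shell
# Plain ps: processes owned by the current user on this terminal.
ps

# Only the row for the current shell; $$ expands to the shell's own PID.
# A trailing "=" after a column name suppresses that column's header.
own_row=$(ps -p $$ -o pid= -o time= -o comm=)
echo "$own_row"
```

The second command is handy in scripts, because it returns a single line with no header to strip.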

To see the processes owned by another user, try:

ps -u fred


where "fred" is a user name from /etc/passwd. For more detail, add the long (-l) or full (-f) option; the same processes are reported, just with more columns. Be careful when concatenating options: "ps -lufred" means "ps -l -u fred", but "ps -ulfred" means "ps -u lfred".
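If you don't have a "fred" handy, the same idea can be tried against your own account. This is a minimal sketch; it assumes that ps -u accepts a numeric user ID as well as a name, and the variable names uid and count are made up for the example.

```shell
# Look up the numeric ID of the user running this shell.
uid=$(id -u)

# Full-format listing of that user's processes.
ps -f -u "$uid"

# Count them: one PID per line, headers suppressed by the trailing "=".
count=$(ps -u "$uid" -o pid= | wc -l | tr -d ' ')
echo "UID $uid owns $count processes"
```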

Each process has both an identifying process ID (a PID), and a parent process ID (a PPID). The parent process is the process that was used to "spawn" (to start) the other process. Usually, the parent is a command-line shell such as ksh or bash, but it can also be a server such as httpd or inetd.
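To see the PID/PPID relationship for yourself, one option is to start at the current shell and follow the PPIDs upward. The loop below is only a sketch: the variable names are illustrative, and it stops as soon as a parent is not visible to ps or the PPID reaches 0.

```shell
pid=$$      # start at the current shell
depth=0     # how many ancestors we managed to print
while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
    # Stop if ps cannot see this PID (for example across a container boundary).
    name=$(ps -p "$pid" -o comm=) || break
    ppid=$(ps -p "$pid" -o ppid= | tr -d ' ')
    echo "PID $pid ($name) has PPID $ppid"
    pid=$ppid
    depth=$((depth + 1))
done
```

On a typical login the chain runs from the shell through a terminal or sshd process and ends at PID 1.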

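PID numbers start at one and go up, so processes started more recently usually have higher numbers, although the numbering wraps around and reuses free numbers once it reaches the system's maximum. If a process has PPID=1, that means it was either started by "init," the first process that runs when Linux starts up, or it has lost its original parent (the parent exited first) and has been adopted by init. Such adopted processes are called "orphans". They should not be confused with "zombies", which are processes that have already exited but whose parent has not yet collected their exit status; ps marks a zombie with a Z in its state column and usually with the tag "<defunct>".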
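Both cases can be spotted from the shell: processes whose parent is PID 1, and processes the kernel marks with state Z (shown by ps as defunct). This sketch assumes a ps that supports the "stat" output column, as the procps version does; zombie_count is just an illustrative name.

```shell
# Processes whose parent is PID 1: started by init, or adopted by it.
ps -e -o pid= -o ppid= -o comm= | awk '$2 == 1'

# Count zombies: their state code starts with Z.
zombie_count=$(ps -e -o stat= | awk '$1 ~ /^Z/ { n++ } END { print n + 0 }')
echo "zombies: $zombie_count"
```

A healthy server normally reports few or no zombies; a count that keeps growing usually points at a parent process that never collects its children.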

A very common use of ps is the command:

ps -ef


which reports every process running, regardless of owner. Also, ps output is highly customizable; the -o option can be used to report exactly the columns you want. Be sure to set and export the COLUMNS environment variable if you want a very wide report, e.g.:

COLUMNS=300; export COLUMNS
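Putting the two together, a custom report might look like the sketch below. The column list is only one possible choice (pid, ppid, user, time and args are standard column names), and the header variable exists just to show what the title row looks like.

```shell
# Allow a very wide report, then list every process with chosen columns.
COLUMNS=300; export COLUMNS
ps -e -o pid,ppid,user,time,args

# The first line of any such report is the header row.
header=$(ps -p $$ -o pid,ppid,user,time,args | sed -n '1p')
echo "$header"
```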


Finally, the -H option, usually used with -e, shows all the processes stacked in a Windows Explorer-like hierarchy. This gives you visual cues (indenting) showing relationships between PPIDs and PIDs.
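A quick way to try the hierarchy view is shown below. It assumes the procps version of ps, where -H can be combined with -e and -o; other ps implementations may not accept -H. The variable name tree is illustrative.

```shell
# Every process, indented under its parent, with PID and PPID alongside.
tree=$(ps -e -H -o pid,ppid,args)
echo "$tree"
```

Reading down the output, each indented command line belongs to the less-indented process above it, and its PPID column matches that process's PID.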

Running ps hundreds of times a second is bad for general performance, because the kernel must spend time reporting on itself. To watch the process list change in real time, most Linux versions include the program "top". You'll find that top acts like ps repeated at a regular interval, by default every few seconds (the delay is configurable). It sorts the output so that the most CPU-intensive processes are, well, at the top. The sort order is configurable. Beware that top (and many other programs) expects a capable terminal emulation in your command window. Usually xterm is enough. Microsoft Windows' simple telnet client is too basic. Try installing the free Windows tool PuTTY if you must telnet to Linux from Windows.
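When top isn't available, or the terminal can't cope with it, a single top-like snapshot can be approximated with ps and sort. This is a sketch, not a replacement for top: cpu_snapshot is a made-up helper name, and the pcpu column in ps is generally an average over the life of the process rather than the last few seconds.

```shell
# Print the five processes with the highest CPU percentage, busiest first.
cpu_snapshot() {
    ps -e -o pcpu= -o pid= -o comm= | sort -rn | head -n 5
}

cpu_snapshot
```

Wrapping the call in a loop with "sleep 1" between runs gives a crude, slow-motion version of what top does.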

Finally, ps and top do their job by reading a special interface to the kernel (on Linux, the /proc filesystem). Many GUI and command-line tools exist that also use this interface. For a quick hands-on assessment of a server's condition, simple ps is usually the safest way to go.
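On Linux that kernel interface is the /proc filesystem, which you can read directly. The sketch below assumes /proc is mounted (it normally is) and compares the parent PID recorded there with the one ps reports; proc_ppid and ps_ppid are illustrative names.

```shell
# Each process has a directory under /proc named after its PID.
grep -E '^(Name|State|Pid|PPid):' /proc/$$/status

# The same parent PID, read two ways.
proc_ppid=$(awk '/^PPid:/ { print $2 }' /proc/$$/status)
ps_ppid=$(ps -p $$ -o ppid= | tr -d ' ')
echo "PPID from /proc: $proc_ppid, from ps: $ps_ppid"
```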
