I am reading Jesse Storimer’s fantastic little book “Working with Unix Processes” right now, and inspiration struck after the second chapter “Processes Have Parents”.

When a Unix process is born, it is a literal copy of its parent process. For example, if I type ls into a bash prompt, the bash process spawns a copy of itself using the fork system call, and that copy then replaces itself with the ls program using exec. The child process (ls) records the id of its parent process (bash). Using the Unix ps command, you can see the parent process id of every process on the system.
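A quick way to see the parent/child relation for yourself is to have a child shell report its own ids. This is only a small sketch: it relies on the standard shell variables $$ (the shell's own PID) and $PPID (its parent's PID), and the message wording is made up for illustration.

```shell
# Print this shell's PID, then start a child shell and let it report
# its own PID and the PID of its parent.
echo "this shell:  PID $$"
sh -c 'echo "child shell: PID $$, parent PID $PPID"'
```

When run from a bash prompt, the parent PID printed by the child should match the PID printed on the first line.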

The only process that has no parent is sched, which has process id zero. The idea I had was to make a visualization of this branching tree of Unix processes. I am currently running Debian GNU/Linux, a Unix-like operating system. I came up with this one-liner that shows the (parent id -> child id) relation:

ps axo ppid,pid | sed "s/\b / -> /g" | grep -v "PID"

The first part calls ps to list every process id along with the id of its parent. Here is some sample output:

~ > ps axo ppid,pid
 PPID   PID
    0     1
    0     2
    2     3
    2     6
    2     7
    2     8
    2    10
    2    12
    2    13

This output is piped into sed (the stream editor), which replaces the first space after the parent id with an arrow “->”:

~ > ps axo ppid,pid | sed "s/\b / -> /g"
 PPID ->   PID
    0 ->     1
    0 ->     2
    2 ->     3
    2 ->     6
    2 ->     7
    2 ->     8
    2 ->    10
    2 ->    12
    2 ->    13
...
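To look at the substitution in isolation, here is a small sketch that feeds three lines copied from the ps output above through the same sed expression. The function name add_arrows is hypothetical, and the sketch assumes GNU sed (as on Debian), since \b (word boundary) is a GNU extension.

```shell
# add_arrows: hypothetical helper wrapping the sed expression from the one-liner.
# "\b " matches a space that directly follows a word character, so only the
# first space after each number (or header word) is replaced.
add_arrows() {
  sed "s/\b / -> /g"
}

# Hard-coded sample input instead of live ps output.
printf ' PPID   PID\n    0     1\n    2     3\n' | add_arrows
```

The leading spaces are left alone because there is no word character in front of them.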

PPID is Parent Process Id, and PID is just Process Id. Finally, I use grep -v "PID" to let through only the lines that don’t contain “PID”. This selects those lines that are actual process relations.

In this case, it just chops off the first line. Next, I wanted to convert this into a file that I can feed into GraphViz, an open source graph visualization tool. The format is pretty simple, so an example is in order:

digraph Foo {
  1 -> 2
  1 -> 3
}

The above file defines a graph called “Foo” that has three nodes and two edges. It looks like this:

Now, all we have to do to the PPID->PID output above is to wrap it in braces and prepend two words to the beginning.

We can use echo "digraph proc { SOME COMMAND }" to wrap the output of our command, then dump the results in a file.

echo "digraph proc { `ps axo ppid,pid | sed "s/\b / -> /g" | grep -v "PID"` } " > proc.dot
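The same wrapping step can also be sketched as a small function. This is an alternative to the sed and grep pipeline, not the command used in this post: it uses awk to keep only the lines whose first field is a number, which drops the header and normalizes the spacing. The name to_dot is made up for this example.

```shell
# to_dot: hypothetical helper. Reads "PPID PID" pairs on stdin and writes
# a GraphViz digraph named "proc" on stdout, one edge per line.
to_dot() {
  echo "digraph proc {"
  # Skip any line whose first field is not a number (such as the header).
  awk '$1 ~ /^[0-9]+$/ { print "  " $1 " -> " $2 }'
  echo "}"
}

# Usage (needs ps):
#   ps axo ppid,pid | to_dot > proc.dot
```

Writing with > rather than >> means a second run replaces proc.dot instead of appending a second graph to it.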

Finally, GraphViz has several commands for rendering graphs in various ways. The first thing I tried was a symmetric layout, but that produced a hierarchical, very wide image. So I tried circo, which produces a radial layout:

~ > echo "digraph proc { `ps axo ppid,pid | sed "s/\b / -> /g" | grep -v "PID"` } " > proc.dot
~ > circo proc.dot -Tpng > radial_proc.png

Here’s the radial layout:

You can see the original ancestor of all processes, sched with PID 0, right in the center. PID 1, which is called init, has a bunch of children. I am writing this post in vim, in a bash shell, in a gnome terminal emulator. The vim PID is 14819, but it is hard to find in this image because there is too much overlap.

Fortunately, we can modify the proc.dot file and include overlap=false right above the PPID->PID pairs. I also found in the man pages for the graphviz tools that the splines=true option draws the edges as splines (curves) instead of straight lines. Finally, instead of circo, there is another tool called neato that renders a more symmetrical graph.
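Rather than editing proc.dot by hand, the two attributes can be written out when the file is generated. The following is a rough sketch under a couple of assumptions: the helper name wrap_with_attrs and the output file name neato_proc.png are invented here, and the edge lines are expected on stdin in the "PPID -> PID" form produced earlier.

```shell
# wrap_with_attrs: hypothetical helper. Wraps edge lines from stdin in a
# digraph and places the two layout attributes above the edges.
wrap_with_attrs() {
  echo "digraph proc {"
  echo "  overlap=false;"
  echo "  splines=true;"
  cat   # pass the "PPID -> PID" lines through unchanged
  echo "}"
}

# Usage (needs ps and the graphviz tools):
#   ps axo ppid,pid | sed "s/\b / -> /g" | grep -v "PID" | wrap_with_attrs > proc.dot
#   neato proc.dot -Tpng > neato_proc.png
```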

This rendering took much longer than the circo rendering, but it is much nicer:

I remember learning in my C programming class that Unix processes all had to be made with fork. It reminded me of asexual reproduction where two identical copies are made. I look forward to learning more about the Unix process model, and recommend Jesse’s book.
