Process creation monitoring

Posted by David N. Welton Mon, 06 Aug 2007 09:53:00 GMT

So, I'm stumped - maybe someone knows the answer to this one. I noticed that on my new Ubuntu system, if I ran ps every 5 seconds or so, the PID jumped significantly, by 5 or 6 each time. Being a bit of a control freak, I want to know exactly what's creating those processes, so I wrote a little Tcl script to try and catch any new [0-9]* directories in /proc and read their cmdline entries, but that doesn't seem to be fast enough to catch the culprit in the act. I had a look at inotify, but that doesn't seem to work with /proc. So at this point I'm looking around for suggestions. I just want to have a record of newly created processes... seems like it ought to be possible.

Comments

  1. Mikael Olenfalk said 28 minutes later:

    You could try getting the info with systemtap http://sourceware.org/systemtap/

  2. Mark Brown said 37 minutes later:

    There are some kernel patches people have been trying to integrate recently to provide this sort of instrumentation. You really do need kernel support to close the races with short lived processes.

    If you have some idea which process is creating new processes you could try ptrace, I expect...

  3. mla@lausch.at said about 1 hour later:

    Each new thread, not only each new process, increments the PID counter. Use "ps -eLf" to look at the threads (LWPs) in a process; they are not listed in /proc. An example with a multithreaded program:

    "ps -eLf" returns 6564 as the PID and 6565-6580 as the LWPs for this PID.

    "ls /proc | grep 6565" returns nothing, but "ls /proc/6565" prints the directory listing for this LWP. This means the /proc directory entries are fake and created on demand. Use systemtap, as Mikael suggested, to instrument the exec/clone calls in the kernel.

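    The two views of /proc described in this comment can be checked directly. What follows is a minimal Python sketch, assuming a Linux system with /proc mounted. The helper names list_thread_ids, listed_in_proc and demo are made up for illustration.

```python
import os
import threading


def list_thread_ids(pid):
    """Return the thread (LWP) IDs of `pid`, read from /proc/<pid>/task."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))


def listed_in_proc(tid):
    """Return True if `tid` appears when the /proc directory itself is listed."""
    return str(tid) in os.listdir("/proc")


def demo(extra_threads=3):
    """Start a few idle threads and report how each one shows up in /proc."""
    stop = threading.Event()
    workers = [threading.Thread(target=stop.wait) for _ in range(extra_threads)]
    for worker in workers:
        worker.start()
    try:
        rows = [(tid, listed_in_proc(tid), os.path.isdir(f"/proc/{tid}"))
                for tid in list_thread_ids(os.getpid())]
    finally:
        # Always release the idle threads, even if reading /proc failed.
        stop.set()
        for worker in workers:
            worker.join()
    return rows


for tid, listed, reachable in demo():
    print(f"tid={tid} listed_in_/proc={listed} reachable_as_/proc/<tid>={reachable}")
```

    Each row compares whether a thread ID shows up in a listing of /proc with whether /proc/<tid> can still be opened by name.
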
  4. Adam said about 1 hour later:

    What if you do a "ps" every 10 seconds or so? Does the process number jump by 10 or 12, or still just by 5 or 6?

    If it's still just 5 or 6, then the short-lived processes might be a consequence of running "ps". Not sure why, but it's a possibility. Try running "strace" on your shell as you launch "ps" to see what happens, or "ps" under "strace". "strace" has options to have it only display fork/exec type calls to cut the output down to something manageable.

    Alternatively, write a script to keep "ps"ing in a loop and store the output of each in a file. You'll probably catch one of them after a while as a process that shows up in only one ps list.
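
    The "shows up in only one list" idea can be sketched as follows, in Python rather than with ps itself. It assumes Linux with /proc mounted, and the names snapshot_pids and seen_once are hypothetical helpers, not part of any tool mentioned here.

```python
import os
from collections import Counter


def snapshot_pids():
    """Return the set of numeric entries currently listed in /proc."""
    return {int(name) for name in os.listdir("/proc") if name.isdigit()}


def seen_once(snapshots):
    """Return the PIDs that appear in exactly one of the given snapshots."""
    counts = Counter(pid for snap in snapshots for pid in snap)
    return sorted(pid for pid, n in counts.items() if n == 1)
```

    Collect snapshots in a loop (for example one call to snapshot_pids() every 100 ms) and pass the list to seen_once(). PIDs that appear only in the first or last snapshot may simply belong to processes that ended or started at the edge of the sampling window. As comment 3 explains, threads are not listed in /proc, so this only catches processes.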

  5. Arvin said about 1 hour later:

    Hi,

    You could also try the BSD process accounting facility of the Linux kernel (General Setup -> BSD Process Accounting) and install the "acct" Debian package.

  6. folkert@vanheusden.com said about 2 hours later:

    Easy: 'acctail' will give you this information nicely. It uses the BSD process accounting facility. URL: http://www.vanheusden.com/acctail/

  7. David said about 3 hours later:

    The connector reports process events to userspace. It uses the netlink mechanism and your kernel must be built with the following configuration options:

    Device Drivers --->
        Connector - unified userspace <-> kernelspace linker --->
            <*> Connector - unified userspace <-> kernelspace linker
            [*]   Report process events to userspace

    This option is available since 2.6.15.
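
    For the curious, a listener for these events might look roughly like the Python sketch below. It is not a tested implementation. The constants and struct layouts are assumptions taken from current <linux/connector.h> and <linux/cn_proc.h> headers and may differ on a 2.6.15-era kernel. The function names are made up for illustration.

```python
import os
import socket
import struct

# Assumed values from <linux/netlink.h>, <linux/connector.h> and
# <linux/cn_proc.h>; check them against your own kernel headers.
NETLINK_CONNECTOR = 11
CN_IDX_PROC = 1
CN_VAL_PROC = 1
NLMSG_DONE = 3
PROC_CN_MCAST_LISTEN = 1
PROC_EVENT_FORK = 0x00000001
PROC_EVENT_EXEC = 0x00000002

NLMSGHDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
CN_MSG = struct.Struct("=IIIIHH")    # idx, val, seq, ack, len, flags
EVENT_HDR = struct.Struct("=IIQ")    # what, cpu, timestamp_ns


def build_listen_request(pid):
    """Build the netlink message that subscribes to process events."""
    payload = struct.pack("=I", PROC_CN_MCAST_LISTEN)
    cn = CN_MSG.pack(CN_IDX_PROC, CN_VAL_PROC, 0, 0, len(payload), 0)
    total = NLMSGHDR.size + len(cn) + len(payload)
    return NLMSGHDR.pack(total, NLMSG_DONE, 0, 0, pid) + cn + payload


def parse_event(data):
    """Decode one message into a tuple, or None for events we ignore."""
    offset = NLMSGHDR.size + CN_MSG.size
    if len(data) < offset + EVENT_HDR.size:
        return None
    what, _cpu, _timestamp = EVENT_HDR.unpack_from(data, offset)
    offset += EVENT_HDR.size
    if what == PROC_EVENT_FORK:
        # A new thread shows up here with child_pid != child_tgid.
        _ppid, ptgid, cpid, ctgid = struct.unpack_from("=IIII", data, offset)
        return ("fork", ptgid, cpid, ctgid)
    if what == PROC_EVENT_EXEC:
        pid, tgid = struct.unpack_from("=II", data, offset)
        return ("exec", pid, tgid)
    return None


def listen():
    """Print fork and exec events until interrupted."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, NETLINK_CONNECTOR)
    try:
        sock.bind((os.getpid(), CN_IDX_PROC))
        sock.send(build_listen_request(os.getpid()))
        while True:
            event = parse_event(sock.recv(4096))
            if event is not None:
                print(event)
    finally:
        sock.close()
```

    Calling listen() will most likely need root privileges. Because the kernel pushes each event as it happens, this approach should not have the race with short-lived processes that polling /proc has.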

  8. Dave Welton said about 4 hours later:

    acctail doesn't do the job, because it's looking at processes, and I think the posters focusing on threads are on the right trail. The only process I can find doing anything is postgresql doing some cleanups, but even with that turned off, I get the mystery threads...

    Even so, thanks for pointing it out - it's a neat program that I'll keep in mind for the future.

  9. Dave Welton said about 4 hours later:

    Ok, the solution!

    I looked at how ps -eLF was doing things, and modified my Tcl code accordingly:

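    proc main {} {
        # Every thread (LWP) has its own entry under /proc/<pid>/task/.
        set glob {/proc/*/task/[0-9]*}

        set procs [glob -nocomplain $glob]

        while {1} {
            set newprocs [glob -nocomplain $glob]
            foreach i $newprocs {
                # Report only entries that were absent from the previous snapshot.
                if { [lsearch -exact $procs $i] < 0 } {
                    puts "New thread $i"
                    # The thread may already be gone, so guard the open and the read.
                    if { ![catch {open "$i/cmdline"} fl] } {
                        # Arguments in cmdline are NUL-separated and print run together.
                        if { ![catch {read $fl} data] } {
                            puts "cmdline: $data"
                        }
                        close $fl
                    }
                }
            }
            set procs $newprocs
        }
    }

    main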

    It's quick and dirty, compared with something like systemtap, but requires no kernel recompile or modifications, and was able to find the culprit:

    gnome-cups-icon--sm-client-iddefault3

    Now, I wonder why that's spawning so many threads?

  10. Dave Welton said about 5 hours later:

    And the gnome cups thing is spawning threads in order to do some HTTP calls to cups without blocking, apparently. Didn't think it was anything important, but it's sort of annoying all the same.

  11. Steve said about 13 hours later:

    You can log all processes with snoopy:

    http://www.debian-administration.org/articles/88

  12. Stuart Yeates said 1 day later:

    What are your cron settings?

    I log authentication, so my logs are full of cron waking up, shedding privileges, forking a process to check whether any jobs need running, finding that none do, and exiting.

  13. Benoît Dejean said 1 day later:

    http://bugzilla.gnome.org/show_bug.cgi?id=439930
