Process creation monitoring
Posted by David N. Welton Mon, 06 Aug 2007 09:53:00 GMT
So, I'm stumped - maybe someone knows the answer to this one. I noticed that on my new Ubuntu system, if I ran ps every 5 seconds or so, the process ID of each new ps jumped significantly, by 5 or 6. Being a bit of a control freak, I want to know exactly what's creating those processes, so I wrote a little Tcl script to try to catch any new [0-9]* directories in /proc and read their cmdline entries, but that doesn't seem to be fast enough to catch the culprit in the act. I had a look at inotify, but that doesn't seem to work with /proc. So at this point, I'm out of ideas and am looking around for suggestions. I just want to have a record of newly created processes... seems like it ought to be possible.

You could try getting the info with systemtap http://sourceware.org/systemtap/
There are some kernel patches people have been trying to integrate recently to provide this sort of instrumentation. You really do need kernel support to close the races with short lived processes.
If you have some idea which process is creating new processes you could try ptrace, I expect...
Every new thread, not just every new process, increments the PID counter. Use "ps -eLf" to look at the threads (LWPs) in a process. You won't find them by listing /proc. A multithreaded program example:
"ps -eLf" returns 6564 as the PID and 6565-6580 as LWPs for this pid
"ls /proc | grep 6565" returns nothing, but "ls /proc/6565" prints the directory listing for this LWP. This means the /proc entries for threads are hidden from the directory listing and only resolved on demand. Use systemtap, as Mikael Olen suggested, to instrument exec/clone calls in the kernel.
What if you do a "ps" every 10 seconds or so? Does the process number jump by 10 or 12, or still just by 5 or 6?
If it's still just 5 or 6, then the short-lived processes might be a consequence of running "ps". Not sure why, but it's a possibility. Try running "strace" on your shell as you launch "ps" to see what happens, or "ps" under "strace". "strace" has options to have it only display fork/exec type calls to cut the output down to something manageable.
Alternatively, write a script to keep "ps"ing in a loop and store the output of each in a file. You'll probably catch one of them after a while as a process that shows up in only one ps list.
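A rough sketch of that loop in Python, in case it helps. It assumes a procps-style "ps" that accepts "-e -o pid= -o args="; the function names (snapshot, seen_only_once, collect) are made up for this example. It only sees processes, not threads, and anything that lives for less than the delay between two snapshots can still slip through.

```python
import subprocess
import time

# Assumed procps-style invocation: every process, PID and command line, no header.
PS_CMD = ["ps", "-e", "-o", "pid=", "-o", "args="]


def snapshot():
    """Return {pid: command line} for every process ps reports right now."""
    out = subprocess.run(PS_CMD, capture_output=True, text=True, check=True).stdout
    own = " ".join(PS_CMD)
    procs = {}
    for line in out.splitlines():
        fields = line.split(None, 1)
        if not fields:
            continue
        args = fields[1].strip() if len(fields) > 1 else ""
        if args == own:
            # Each ps run has a fresh PID of its own; skip it so it is not
            # reported as a short-lived process every single time.
            continue
        procs[int(fields[0])] = args
    return procs


def seen_only_once(snapshots):
    """Return {pid: command line} for PIDs present in one snapshot but in
    neither the snapshot before it nor the one after it."""
    transient = {}
    for prev, cur, nxt in zip(snapshots, snapshots[1:], snapshots[2:]):
        for pid, args in cur.items():
            if pid not in prev and pid not in nxt:
                transient[pid] = args
    return transient


def collect(count=200, delay=0.05):
    """Take `count` snapshots, sleeping `delay` seconds between them."""
    snaps = []
    for _ in range(count):
        snaps.append(snapshot())
        time.sleep(delay)
    return snaps
```

Something like seen_only_once(collect()) then gives the candidates. The first and last snapshots are never reported, since there is no neighbour on one side to compare against.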
Hi,
You could also try the BSD process accounting facility of the Linux kernel (General Setup->BSD Process Accounting) and install the "acct" Debian package.
Easy. 'acctail' will give you this information nicely. URL: http://www.vanheusden.com/acctail/ It uses the BSD process accounting facility.
The connector reports process events to userspace. It uses the netlink mechanism and your kernel must be built with the following configuration options:
Device Drivers --->
  Connector - unified userspace <-> kernelspace linker --->
    <*> Connector - unified userspace <-> kernelspace linker
    [*]   Report process events to userspace
This option is available since 2.6.15.
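For what it's worth, here is a minimal Python sketch of a listener for those events. Treat it as an illustration under assumptions rather than a reference implementation: the numeric constants and struct layouts are written down as I understand them from linux/netlink.h, linux/connector.h and linux/cn_proc.h and should be checked against your own kernel headers, the function names are invented for this example, it assumes one netlink message per datagram, and opening the socket normally requires root.

```python
import os
import socket
import struct

# Assumed values; verify against your kernel headers before relying on them.
NETLINK_CONNECTOR = 11
CN_IDX_PROC = 1
CN_VAL_PROC = 1
NLMSG_DONE = 3
PROC_CN_MCAST_LISTEN = 1
PROC_EVENT_FORK = 0x00000001
PROC_EVENT_EXEC = 0x00000002

NLMSGHDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
CN_MSG = struct.Struct("=IIIIHH")    # cb_id.idx, cb_id.val, seq, ack, len, flags
EVENT_HDR = struct.Struct("=IIQ")    # what, cpu, timestamp_ns
FORK_DATA = struct.Struct("=iiii")   # parent_pid, parent_tgid, child_pid, child_tgid
EXEC_DATA = struct.Struct("=ii")     # process_pid, process_tgid


def listen_request(pid):
    """Build the message that asks the kernel to start sending process events."""
    op = struct.pack("=I", PROC_CN_MCAST_LISTEN)
    cn = CN_MSG.pack(CN_IDX_PROC, CN_VAL_PROC, 0, 0, len(op), 0)
    total = NLMSGHDR.size + len(cn) + len(op)
    return NLMSGHDR.pack(total, NLMSG_DONE, 0, 0, pid) + cn + op


def parse_event(datagram):
    """Decode a fork or exec event; return None for anything else."""
    offset = NLMSGHDR.size + CN_MSG.size
    if len(datagram) < offset + EVENT_HDR.size:
        return None
    what, _cpu, _timestamp = EVENT_HDR.unpack_from(datagram, offset)
    body = offset + EVENT_HDR.size
    if what == PROC_EVENT_FORK and len(datagram) >= body + FORK_DATA.size:
        _ppid, ptgid, cpid, ctgid = FORK_DATA.unpack_from(datagram, body)
        return {"event": "fork", "parent": ptgid, "child_tid": cpid, "child_pid": ctgid}
    if what == PROC_EVENT_EXEC and len(datagram) >= body + EXEC_DATA.size:
        tid, tgid = EXEC_DATA.unpack_from(datagram, body)
        return {"event": "exec", "tid": tid, "pid": tgid}
    return None


def monitor():
    """Subscribe to process events and print fork/exec events until interrupted."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM, NETLINK_CONNECTOR)
    sock.bind((os.getpid(), CN_IDX_PROC))
    sock.send(listen_request(os.getpid()))
    while True:
        event = parse_event(sock.recv(4096))
        if event:
            print(event)
```

Calling monitor() as root should print one line per fork or exec. As far as I know, fork events are also reported for new threads; in that case child_tid differs from child_pid, which is handy for telling a new thread from a new process.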
acctail doesn't do the job, because it only looks at processes, and I think the posters focusing on threads are on the right trail. The only process I can find doing anything is postgresql doing some cleanups, but even turning that off, I get the mystery threads...
Even so, thanks for pointing it out - it's a neat program that I'll keep in mind for the future.
Ok, the solution!
I looked at how ps -eLF was doing things, and modified my Tcl code accordingly:
It's quick and dirty, compared with something like systemtap, but requires no kernel recompile or modifications, and was able to find the culprit:
gnome-cups-icon--sm-client-iddefault3
Now, I wonder why that's spawning so many threads?
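For anyone who wants to try the same kind of polling, here is a rough Python sketch of the idea. It is not the Tcl script mentioned above, just an illustration under the assumption that every thread shows up as a numeric entry under /proc/<pid>/task; the function names are invented for this example.

```python
import os
import time


def read_cmdline(pid, proc_root="/proc"):
    """Return the command line of a process, with NUL separators turned into spaces."""
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            raw = f.read()
    except OSError:
        return ""  # the process went away before we could read it
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


def scan_tasks(proc_root="/proc"):
    """Return {tid: pid} for every thread visible under <proc_root>/<pid>/task."""
    tasks = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            tids = os.listdir(os.path.join(proc_root, name, "task"))
        except OSError:
            continue  # the process exited between the two listdir calls
        for tid in tids:
            if tid.isdigit():
                tasks[int(tid)] = int(name)
    return tasks


def watch(rounds=1000, interval=0.05, proc_root="/proc"):
    """Poll the task lists and print every thread ID not seen in the previous scan."""
    known = scan_tasks(proc_root)
    for _ in range(rounds):
        time.sleep(interval)
        current = scan_tasks(proc_root)
        for tid, pid in current.items():
            if tid not in known:
                print(tid, pid, read_cmdline(pid, proc_root))
        known = current
```

Calling watch() prints the thread ID, the owning PID and that process's command line for each newly seen thread. Like any polling approach it is racy: a thread that starts and exits between two scans is missed, so a smaller interval catches more at the cost of more CPU.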
And the gnome cups thing is spawning threads in order to do some HTTP calls to cups without blocking, apparently. Didn't think it was anything important, but it's sort of annoying all the same.
You can log all processes with snoopy:
http://www.debian-administration.org/articles/88
What are your cron settings?
I log authentication, so my logs are full of cron waking up, shedding privileges, forking a process to check whether any jobs need running, finding that none do, and exiting.
http://bugzilla.gnome.org/show_bug.cgi?id=439930