Your program forks children, but the dead children accumulate, fill up your process table, and aggravate your system administrator.
If you don't need to record the children that have terminated, use:
$SIG{CHLD} = 'IGNORE';
To keep better track of deceased children, install a SIGCHLD handler to call waitpid:
use POSIX ":sys_wait_h";

$SIG{CHLD} = \&REAPER;
sub REAPER {
    my $stiff;
    while (($stiff = waitpid(-1, &WNOHANG)) > 0) {
        # do something with $stiff if you want
    }
    $SIG{CHLD} = \&REAPER;    # install *after* calling waitpid
}
When a process exits, the system keeps it in the process table so the parent can check its status - whether it terminated normally or abnormally. Fetching a child's status (thereby freeing it to drop from the system altogether) is rather grimly called
reaping
dead children. (This entire recipe is full of ways to harvest your dead children. If this makes you queasy, we understand.) It involves a call to
wait
or
waitpid
. Some Perl functions (piped opens, system, and backticks) will automatically reap the children they make, but you must explicitly wait when you use fork to manually start another process.
To avoid accumulating dead children, simply tell the system that you're not interested in them by setting
$SIG{CHLD}
to
"IGNORE"
. If you want to know which children die and when, you'll need to use
waitpid
.
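As a rough illustration of the "IGNORE" approach, the sketch below forks a few short-lived children and then checks that nothing is left to reap. The loop count and the one-second pause are arbitrary choices for the demonstration, and the -1 result from wait assumes a system that reaps automatically when SIGCHLD is ignored (most modern Unix systems do).

```perl
use strict;
use warnings;

# Tell the system we don't care about exit statuses, so children
# are discarded as soon as they exit instead of becoming zombies.
$SIG{CHLD} = 'IGNORE';

for my $n (1 .. 3) {
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    exit 0 if $pid == 0;    # child: exit immediately
}

sleep 1;                    # give the children time to exit

# Assumption: with SIGCHLD ignored there is nothing left to collect,
# so wait reports "no child processes" by returning -1.
my $reaped = wait();
print "wait returned $reaped\n";
```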
The
waitpid
function reaps a single process. Its first argument is the process to wait for - use
-1
to mean any process - and its second argument is a set of flags. We use the WNOHANG flag to make
waitpid
immediately return
0
if there are no dead children. A flag value of
0
is supported everywhere, indicating a blocking wait. Call
waitpid
from a SIGCHLD handler, as we do in the Solution, to reap the children as soon as they die.
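To make the two flag values concrete, here is a minimal sketch that polls a single child with WNOHANG and then falls back to a blocking wait with a flag value of 0. The one-second sleep and the exit value 3 are made up for the demonstration.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {            # child: linger briefly, then exit
    sleep 1;
    exit 3;
}

# Non-blocking poll: returns immediately, and returns 0 while the
# child is still running.
my $poll = waitpid($pid, WNOHANG);

# Blocking wait: a flag value of 0 pauses until the child is dead.
my $done       = waitpid($pid, 0);
my $exit_value = $? >> 8;

print "poll returned $poll; reaped $done with exit value $exit_value\n";
```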
The
wait
function also reaps children, but it does not have a non-blocking option. If you inadvertently call it when there are running child processes but none have exited, your program will pause until there is a dead child.
Because the kernel keeps track of undelivered signals using a bit vector, one bit per signal, if two children die before your process is scheduled, you will get only a single SIGCHLD. You must always loop when you reap in a SIGCHLD handler, and so you can't use
wait
.
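The need for a loop can be seen in a small experiment. The sketch below starts five children that all exit at nearly the same moment, so some of their SIGCHLD signals may be merged; the handler still collects every one because it keeps calling waitpid until nothing is left. The names %status and reap_all are ours, and localizing $! and $? inside the handler is an extra precaution that is not part of the Solution.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %status;                 # pid => wait status of each reaped child

sub reap_all {
    local ($!, $?);         # don't disturb the interrupted code
    # One signal may stand for several dead children, so loop.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $status{$pid} = $?;
    }
}
$SIG{CHLD} = \&reap_all;

my $kids = 5;
for my $n (1 .. $kids) {
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    exit $n if $pid == 0;   # child: exit with a distinct value
}

# sleep returns early when a signal arrives; keep going until
# every child has been accounted for.
sleep 1 while keys(%status) < $kids;

my @exit_values = sort { $a <=> $b } map { $_ >> 8 } values %status;
print "reaped exit values: @exit_values\n";
```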
Both
wait
and
waitpid
return the process ID that they just reaped and set
$?
to the wait status of the defunct process. This status is actually two 8-bit values in one 16-bit number. The high byte is the exit value of the process. The low 7 bits represent the number of the signal that killed the process, with the 8th bit indicating whether a core dump occurred. Here's one way to isolate those values:
$exit_value  = $? >> 8;
$signal_num  = $? & 127;
$dumped_core = $? & 128;
The standard POSIX module has macros to interrogate status values: WIFEXITED, WEXITSTATUS, WIFSIGNALED, and WTERMSIG. Oddly enough, POSIX doesn't have a macro to test whether the process core dumped.
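As a sketch of how those macros fit together, the helper below turns a wait status into a short description. The names describe_status and run_child are hypothetical, invented for this example; the statuses come from two real children, one that exits with the value 3 and one that kills itself with SIGKILL (signal 9).

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

# Hypothetical helper: describe a wait status using the POSIX macros.
sub describe_status {
    my ($status) = @_;
    if (WIFEXITED($status)) {
        return "exited with value " . WEXITSTATUS($status);
    }
    elsif (WIFSIGNALED($status)) {
        return "killed by signal " . WTERMSIG($status);
    }
    return "neither exited nor killed (status $status)";
}

# Hypothetical helper: run $code in a child, wait for it with a
# blocking waitpid, and return its wait status.
sub run_child {
    my ($code) = @_;
    my $pid = fork();
    die "cannot fork: $!" unless defined $pid;
    if ($pid == 0) {
        $code->();
        exit 0;
    }
    waitpid($pid, 0);
    return $?;
}

my $exited = describe_status(run_child(sub { exit 3 }));
my $killed = describe_status(run_child(sub { kill 'KILL', $$; sleep 10 }));

print "$exited\n$killed\n";
```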
Beware of two things when using SIGCHLD. First, the system doesn't just send a SIGCHLD when a child exits; it also sends one when the child stops. A process can stop for many reasons - waiting to be foregrounded so it can do terminal I/O, being sent a SIGSTOP (it will wait for the SIGCONT before continuing), or being suspended from its terminal. You need to check the status with the
WIFEXITED[1] function from the POSIX module to make sure you're dealing with a process that really died, and isn't just suspended.
[1] Not SPOUSEXITED, even on a PC.
use POSIX qw(:signal_h :errno_h :sys_wait_h);

$SIG{CHLD} = \&REAPER;
sub REAPER {
    my $pid;

    $pid = waitpid(-1, &WNOHANG);

    if ($pid == -1) {
        # no child waiting.  Ignore it.
    } elsif (WIFEXITED($?)) {
        print "Process $pid exited.\n";
    } else {
        print "False alarm on $pid.\n";
    }
    $SIG{CHLD} = \&REAPER;    # in case of unreliable signals
}
The second trap with SIGCHLD is related to Perl, not the operating system. Because
system
,
open
, and backticks all fork subprocesses and the operating system sends your process a SIGCHLD whenever any of its subprocesses exit, you could get called for something you weren't expecting. The built-in operations all wait for the child themselves, so sometimes the SIGCHLD will arrive before the
close
on the filehandle blocks to reap it. If the signal handler gets to it first, the zombie won't be there for the normal close. This makes
close
return false and set
$!
to
"No
child
processes"
. Conversely, if the
close
gets to the dead child first,
waitpid
will return
0
.
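One possible way around this second trap, offered only as a sketch and not as the recipe's own method, is to have the handler reap just the processes you forked yourself, leaving the children of system, open, and backticks for Perl to collect. The names %kids, %done, and reap_known are hypothetical, and the sketch assumes each PID is recorded in %kids before its child can exit.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";

my %kids;    # PIDs we forked ourselves and still need to reap
my %done;    # pid => wait status of children we have reaped

sub reap_known {
    local ($!, $?);    # don't disturb the interrupted code
    # Ask only about our own children, never waitpid(-1, ...), so a
    # child started by system() is left for system() to reap.
    for my $pid (keys %kids) {
        if (waitpid($pid, WNOHANG) == $pid) {
            $done{$pid} = $?;
            delete $kids{$pid};
        }
    }
}
$SIG{CHLD} = \&reap_known;

my $pid = fork();
die "cannot fork: $!" unless defined $pid;
if ($pid == 0) {       # child: linger briefly, then exit
    sleep 1;
    exit 7;
}
$kids{$pid} = 1;

# Run a trivial command ($^X is the running perl); the handler does
# not steal its exit status.
my $rc = system($^X, '-e', 'exit 0');

# Wait for our own child, polling as well in case a signal was missed.
until (exists $done{$pid}) {
    sleep 1;
    reap_known();
}

my $child_exit = $done{$pid} >> 8;
print "system returned $rc; child exited with $child_exit\n";
```

The cost of this design is that the handler checks every outstanding PID on each signal, which is fine for a handful of children but worth reconsidering for hundreds.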
Most systems support a non-blocking
waitpid
. Use Perl's standard Config.pm module to find out:
use Config;
$has_nonblocking = $Config{d_waitpid} eq "define" ||
                   $Config{d_wait4}   eq "define";
System V defines SIGCLD, which has the same signal number as SIGCHLD but subtly different semantics. Use SIGCHLD to avoid confusion.
The "Signals" sections in Chapter 6 of Programming Perl and in perlipc(1); the wait and waitpid functions in Chapter 3 of Programming Perl and in perlfunc(1); the documentation for the standard POSIX module, in Chapter 7 of Programming Perl; your system's sigaction(2), signal(3), and kill(2) manpages (if you have them); Recipe 16.17