diff options
author | danielsdeleo <dan@opscode.com> | 2013-09-09 16:45:43 -0700 |
---|---|---|
committer | danielsdeleo <dan@opscode.com> | 2013-09-09 16:45:43 -0700 |
commit | 89f7f802e3bc5ab92c32eff7f065acdebff8492c (patch) | |
tree | 676e902afb6e71fa8af832d0a3c3c2dac77785d0 | |
parent | 4c1e6fee46638a668e0b05914371d9271f590a86 (diff) | |
download | mixlib-shellout-89f7f802e3bc5ab92c32eff7f065acdebff8492c.tar.gz |
Wait for child process to die in case of critical errors
Ported the regression test from OHAI-455 to mixlib-shellout, which
revealed a race condition in the way shellout reaps the child after a
failed exec (most commonly caused by attempting to run a command that
doesn't exist). The call to waitpid2 used WNOHANG to avoid hanging
indefinitely if an error was caused but the child process was still
alive, but this results in the child process not getting reaped if it
exits after the call to waitpid2. In a single run of the stress test,
this occurred 94/100 times, so it is very likely that mixlib-shellout
will leak zombies for the ENOENT case without this change.
-rw-r--r-- | lib/mixlib/shellout/unix.rb | 2 | ||||
-rw-r--r-- | spec/mixlib/shellout_spec.rb | 32 |
2 files changed, 33 insertions, 1 deletions
diff --git a/lib/mixlib/shellout/unix.rb b/lib/mixlib/shellout/unix.rb index 55a0372..af3e421 100644 --- a/lib/mixlib/shellout/unix.rb +++ b/lib/mixlib/shellout/unix.rb @@ -84,7 +84,7 @@ module Mixlib self rescue Exception # do our best to kill zombies - Process.waitpid2(@child_pid, Process::WNOHANG) rescue nil + Process.waitpid2(@child_pid) rescue nil raise ensure # no matter what happens, turn the GC back on, and hope whatever busted diff --git a/spec/mixlib/shellout_spec.rb b/spec/mixlib/shellout_spec.rb index 89f21e7..23d9a13 100644 --- a/spec/mixlib/shellout_spec.rb +++ b/spec/mixlib/shellout_spec.rb @@ -777,6 +777,38 @@ describe Mixlib::ShellOut do end end + context "when running a command that doesn't exist" do + + let(:cmd) { "/bin/this-is-not-a-real-command" } + + def shell_out_cmd + Mixlib::ShellOut.new(cmd) + end + + it "reaps zombie processes after exec fails [OHAI-455]" do + # NOTE: depending on ulimit settings, GC, etc., before the OHAI-455 patch, + # ohai could also exhaust the available file descriptors when creating this + # many zombie processes. A regression _could_ cause Errno::EMFILE but this + # probably won't be consistent on different environments. + created_procs = 0 + 100.times do + begin + shell_out_cmd.run_command + rescue Errno::ENOENT + created_procs += 1 + end + end + created_procs.should == 100 + reaped_procs = 0 + begin + loop { Process.wait(-1); reaped_procs += 1 } + rescue Errno::ECHILD + end + reaped_procs.should == 0 + end + end + + context 'with open files for parent process' do before do @test_file = Tempfile.new('fd_test') |