[arch-projects] [initscripts] [PATCH 09/11] functions: Speed up reboot/shutdown by recognizing killall5 exit code 2

Dave Reisner d at falconindy.com
Sat Jul 2 16:09:15 EDT 2011


On Sat, Jul 02, 2011 at 09:56:34PM +0200, Kurt J. Bosch wrote:
> Dave Reisner, 2011-07-02 21:05:
> >On Sat, Jul 02, 2011 at 08:44:27PM +0200, Kurt J. Bosch wrote:
> >>killall5 returns 2 if it didn't kill any processes. Using this avoids sleeping longer than needed. This saves up to another six seconds of reboot/shutdown/go-single time.
> >>---
> >>  functions |   16 ++++++++++++----
> >>  1 files changed, 12 insertions(+), 4 deletions(-)
> >>
> >>diff --git a/functions b/functions
> >>index 69f06eb..401e323 100644
> >>--- a/functions
> >>+++ b/functions
> >>@@ -293,13 +293,21 @@ kill_everything() {
> >>
> >>  	# Terminate all processes
> >>  	stat_busy "Sending SIGTERM To Processes"
> >>-		killall5 -15 ${omit_pids[@]/#/-o }&>/dev/null
> >>-		sleep 5
> >>+		local i
> >>+		for (( i=0; i<500; i+=25 )); do
> >>+			killall5 -15 ${omit_pids[@]/#/-o }&>/dev/null
> >>+			(( $? == 2 ))&&  break
> >>+			sleep .25
> >>+		done
> >
> >In the context of killall5, 'killed' means a signal was sent. This will
> >cause a zombie process to hang shutdown for 2 minutes.
> 
> Sorry, but I can't see how sending multiple SIGTERMs to a zombie
> should cause any problem. Why 2 minutes? Could you please explain
> this a bit more?
> 

A zombie will receive signals but will never act on them. Therefore,
killall5 will never give you the exit value of 2 that is needed to break
this loop early. See note below about centiseconds...

> >
> >>  	stat_done
> >>
> >>  	stat_busy "Sending SIGKILL To Processes"
> >>-		killall5 -9 ${omit_pids[@]/#/-o }&>/dev/null
> >>-		sleep 1
> >>+		local i
> >>+		for (( i=0; i<100; i+=25 )); do
> >>+			killall5 -9 ${omit_pids[@]/#/-o }&>/dev/null
> >>+			(( $? == 2 ))&&  break
> >>+			sleep .25
> >>+		done
> >
> >Ideally, this never kills anything, because all our processes exited
> >nicely in the loop above. However, if it does successfully send a signal
> >to a process, then we just spent the past 125 seconds waiting on the
> >above SIGTERM spam to time out. When it does time out, we're going to
> >spend another 25 seconds here waiting for the same process to be spammed
> >with SIGKILL.
> 
> We would spend up to 5 seconds for waiting above - not more as
> without the patch, but maybe less. Here we spend up to 1 second as
> before.
> Note: Time is measured in centiseconds here.

I failed to notice the i+=25 bit. But why? ((i=0; i<4; i++)) or even
{1..4} is much more readable and accomplishes the same thing here (the
SIGTERM loop above would count to 20 instead). Barring that, a comment
would be nice for those skimming the code.
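
To make the readability point concrete, here is a rough sketch of the same
retry logic counting attempts rather than centiseconds. The helper name
signal_until_gone and its arguments are made up for illustration and are not
part of initscripts; the only behaviour taken from the patch is that
killall5 exits with 2 when it signalled nothing.

```shell
#!/bin/bash
# Hypothetical helper, not part of initscripts.
#   $1   = maximum number of attempts
#   $2   = seconds to sleep between attempts
#   rest = command to run on each attempt
# Returns 0 as soon as the command exits with status 2 (killall5's
# "no processes were signalled"), or 1 once the attempts are used up.
signal_until_gone() {
	local attempts=$1 interval=$2 i
	shift 2
	for (( i = 0; i < attempts; i++ )); do
		"$@" &>/dev/null
		(( $? == 2 )) && return 0
		sleep "$interval"
	done
	return 1
}

# Intended use, mirroring the timings in the patch:
#   20 attempts x .25s = 5 seconds for SIGTERM
#    4 attempts x .25s = 1 second  for SIGKILL
# signal_until_gone 20 .25 killall5 -15 ${omit_pids[@]/#/-o }
# signal_until_gone  4 .25 killall5 -9  ${omit_pids[@]/#/-o }
```

This only tidies up the loop bookkeeping. It still re-sends the signal on
every attempt, so it does nothing about the objection to the approach
itself.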

> >
> >>  	stat_done
> >>
> >>  	run_hook "$1_postkillall"
> >>--
> >>1.7.1
> >>
> >
> >There's probably a handful of programs that don't appreciate receiving
> >SIGTERM every 1/4 of a second.
> >
> >dave
> >
> Works well here. Would you recommend a longer interval?
> 

Spamming signals until you get a magical return value seems like a hack.
It's not that I disagree with the interval so much as I disagree with
the approach.

dave


