[monit-dev] control: consume less CPU time while waiting for a process to start/stop

Thomas Petazzoni-2
control: consume less CPU time while waiting for a process to start/stop

Monit uses the wait_start() and wait_stop() functions when waiting for
a given process to start or stop. In those functions, Monit polls the
state of the process until it has started or stopped, or until a
timeout expires (by default EXEC_TIMEOUT, 30 seconds).

Unfortunately, during this period, Monit polls very aggressively: it
checks every 5 milliseconds whether the process has started or
not. Since each poll scans quite a large number of files in /proc,
Monit consumes about 60-70% of the CPU time on a fairly low-end
ARMv5 device (200 MHz). And when the process Monit tries to start
never actually starts (for some reason), Monit consumes 60-70% of
the CPU time forever.

This value of 5 milliseconds being way too aggressive, we take a
different approach:

 * During the first second, we poll every 200ms to see if the process
   has started or has stopped

 * During the rest of the timeout period, we poll every second

Signed-off-by: Thomas Petazzoni <[hidden email]>

Index: b/src/control.c
===================================================================
--- a/src/control.c
+++ b/src/control.c
@@ -418,11 +418,18 @@
         int isrunning = FALSE;
         time_t timeout = time(NULL) + s->start->timeout;
         int debug = Run.debug;
+        unsigned waiting_cycles = 0;
         ASSERT(s);
         while ((time(NULL) < timeout) && !Run.stopped) {
                 if ((isrunning = Util_isProcessRunning(s, TRUE)))
                         break;
-                Time_usleep(5000);
+                /* Test if the process exists every 200ms during the
+                   first second, and then every second */
+                if (waiting_cycles < 5)
+                  Time_usleep(200 * 1000);
+                else
+                  Time_usleep(1000 * 1000);
+                waiting_cycles++;
                 Run.debug = 0; // Turn off debug second time through to avoid flooding the log with pid file does not exist. This poll stuff here _will_ be refactored away
         }
         Run.debug = debug;
@@ -446,11 +453,18 @@
         int isrunning = TRUE;
         time_t timeout = time(NULL) + s->stop->timeout;
         int debug = Run.debug;
+        unsigned waiting_cycles = 0;
         ASSERT(s);
         while ((time(NULL) < timeout) && !Run.stopped) {
                 if (! (isrunning = Util_isProcessRunning(s, TRUE)))
                         break;
-                Time_usleep(5000);
+                /* Test if the process stopped every 200ms during the
+                   first second, and then every second */
+                if (waiting_cycles < 5)
+                  Time_usleep(200 * 1000);
+                else
+                  Time_usleep(1000 * 1000);
+                waiting_cycles++;
                 Run.debug = 0; // Turn off debug second time through to avoid flooding the log with pid file does not exist. This poll stuff here _will_ be refactored away
         }
         Run.debug = debug;
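
To put the two schedules in numbers, here is a rough sketch that counts how many polls fit into one 30-second timeout. It is an idealised model, not Monit code: count_polls() is a hypothetical helper, it ignores the time each poll itself takes, and it ignores the one-second granularity of time(NULL) used by the real loops.

```c
#include <assert.h>

/* Hypothetical helper (not part of Monit): count how many polls happen
   within timeout_ms when the first fast_polls sleeps last fast_ms each
   and every later sleep lasts slow_ms. The cost of a poll is ignored. */
unsigned count_polls(unsigned timeout_ms, unsigned fast_polls,
                     unsigned fast_ms, unsigned slow_ms)
{
        unsigned elapsed = 0;
        unsigned polls = 0;

        assert(slow_ms > 0); /* otherwise the loop could never finish */

        while (elapsed < timeout_ms) {
                polls++;
                /* Sleep after the poll, as the loops in the patch do */
                elapsed += (polls <= fast_polls) ? fast_ms : slow_ms;
        }
        return polls;
}
```

Under this model, count_polls(30000, 0, 0, 5) gives 6000 for the original fixed 5ms sleep, while count_polls(30000, 5, 200, 1000) gives 34 for the patched schedule (five 200ms sleeps, then one per second).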


--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

_______________________________________________
monit-dev mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/monit-dev

Re: [monit-dev] control: consume less CPU time while waiting for a process to start/stop

martinp@tildeslash.com
Hi Thomas,

thanks for the patch, the process status polling really was too aggressive for low-power devices. I have refactored wait_start/wait_stop a little bit more:

1.) the wait starts at 50ms and then doubles each cycle until 1s is reached (50ms -> 100ms -> 200ms -> 400ms -> 800ms -> 1s), after which Monit continues polling every 1 second. This detects state changes of fast starting/stopping services quickly and gradually slows down the polling if the state change takes longer.

2.) the wait now always sleeps for the initial period (50ms) before the first check - originally the check ran immediately after the start/stop program was spawned, which was too early in most cases

3.) the wait is made resistant to large time changes
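
The three points above could be sketched roughly as follows. This is not the code that went into Monit 5.4: wait_for_state(), next_interval_ms(), monotonic_ms(), sleep_ms() and flag_is_set() are hypothetical names, and using CLOCK_MONOTONIC is only one common way to become independent of wall-clock changes (the thread does not say how Monit does it). The Run.stopped check and EINTR handling are left out for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Point 1: double the interval each cycle, capped at max_ms. */
long next_interval_ms(long current_ms, long max_ms)
{
        assert(current_ms > 0 && max_ms > 0);
        long doubled = current_ms * 2;
        return doubled < max_ms ? doubled : max_ms;
}

/* Point 3 (assumption): a monotonic clock does not jump when the
   wall-clock time is changed. */
long long monotonic_ms(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Sleep for ms milliseconds; an interrupted sleep is simply cut short. */
void sleep_ms(long ms)
{
        struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
}

/* Poll check(arg) until it returns wanted or timeout_ms has elapsed.
   Point 2: the loop sleeps before the first check. */
bool wait_for_state(bool (*check)(void *), void *arg, bool wanted,
                    long timeout_ms)
{
        long long deadline = monotonic_ms() + timeout_ms;
        long wait = 50; /* initial period from the description above */

        do {
                sleep_ms(wait);
                if (check(arg) == wanted)
                        return true;
                wait = next_interval_ms(wait, 1000);
        } while (monotonic_ms() < deadline);
        return false;
}

/* Stand-in for a real process-state check: reads a bool flag. */
bool flag_is_set(void *arg)
{
        return *(bool *)arg;
}
```

In a real wait_start(), the check callback would wrap something like Util_isProcessRunning(); the flag-based check here only exists so that the sketch can be exercised on its own.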

The fix will be part of the upcoming Monit 5.4, which should be available tomorrow.

Regards,
Martin



On Apr 11, 2012, at 3:39 PM, Thomas Petazzoni wrote:

> control: consume less CPU time while waiting for a process to start/stop

Re: [monit-dev] control: consume less CPU time while waiting for a process to start/stop

Michael Shigorin
On Fri, May 04, 2012 at 09:08:13PM +0200, Martin Pala wrote:
> The fix will be part of upcoming Monit 5.4 which should be available tomorrow.

Yay, thank you both!

--
 ---- WBR, Michael Shigorin <[hidden email]>
  ------ Linux.Kiev http://www.linux.kiev.ua/


Re: [monit-dev] control: consume less CPU time while waiting for a process to start/stop

Thomas Petazzoni-2
Hello Martin,

On Fri, 4 May 2012 21:08:13 +0200,
Martin Pala <[hidden email]> wrote:

> thanks for the patch, the process status polling really was too
> aggressive for low-power devices. I have refactored
> wait_start/wait_stop a little bit more:
>
> 1.) the wait starts at 50ms and then doubles each cycle until 1s is
> reached (50ms -> 100ms -> 200ms -> 400ms -> 800ms -> 1s), after which
> Monit continues polling every 1 second. This detects state changes of
> fast starting/stopping services quickly and gradually slows down
> the polling if the state change takes longer.
>
> 2.) the wait now always sleeps for the initial period (50ms) before
> the first check - originally the check ran immediately after the
> start/stop program was spawned, which was too early in most cases
>
> 3.) the wait is made resistant to large time changes
>
> The fix will be part of the upcoming Monit 5.4, which should be
> available tomorrow.

Great, sounds good.

Thomas