Check the age of a process

David Jones
I searched the manual and Googled, but didn't find a way to do this natively inside monit.  Currently I am running a Nagios plugin script from Nagios Exchange.  This seems like it would be a natural addition to monit.

I have some scripts that our support team can run on a server. They work like a tail command on our web filtering logs and are used for troubleshooting blocks/allows.  I want to know if someone has left one running too long and forgotten to close it.  In fact, it would be nice to be able to kill the specific PID of the one that has been running too long.  If it's possible to set an environment variable or something similar that could be passed to the exec, that would be perfect.

check process filter_grep_age matching filter_grep
    if timestamp is older than 10 hours then exec "/bin/kill ${FILTER_GREP_AGE_PID}"
    if timestamp is older than 8 hours then alert

Thanks,
Dave

Re: Check the age of a process

Lutz Mader
Hello Dave,
a process uptime test is available, see
https://mmonit.com/monit/documentation/monit.html#UPTIME-TEST

The uptime test is available for process and system service
definitions only.

> Example of restarting the process every three days:
>
>  check process myapp with pidfile /var/run/myapp.pid
>     start program = "/etc/init.d/myapp start"
>     stop program = "/etc/init.d/myapp stop"
>     if uptime > 3 days then restart

With regards,
Lutz

P.S.
The test has been available for some years (I found it in Monit 5.16).
Check which Monit version you are using with the command "monit -V".


Re: Check the age of a process

monit-general mailing list
What if I wanted to check for something running too long?  I tried this, and it alerts when the process is not present or has less uptime.  Is it possible to negate the alerting logic so it's OK when something is not running or is younger than a certain age?

check process filter_grep_age matching filter_grep
    if uptime > 7 hours then alert

This filter_grep command shouldn't run for more than a few hours; if it does, a user has forgotten to stop it.



Re: Check the age of a process

Szépe Viktor
Monit has more and more features.
Other things should be implemented as shell scripts.
This is my take on long-running cron jobs, installed on all production servers:
https://github.com/szepeviktor/debian-server-tools/blob/master/monitoring/cron-long.sh


Quoting David Jones via This is the general mailing list for
monit <[hidden email]>:

> What if I wanted to check for something running too long?  I tried  
> this and it's alerting for not being present or for having less  
> uptime.  Is it possible to negate the alerting logic so it's OK when  
> something is not running or it's less that a certain age?
>
> check process filter_grep_age matching filter_grep
>     if uptime > 7 hours then alert
>
> This filter_grep command shouldn't run for more than a few hours or  
> a user has forgotten to stop it.


SZÉPE Viktor, web application operations / Running your application
https://github.com/szepeviktor/debian-server-tools/blob/master/CV.md
~~~
hotline: +36-20-4242498  [hidden email]  skype: szepe.viktor
Budapest, 3rd district







Re: Check the age of a process

monit-general mailing list
check program filter_grep_age with path "/usr/lib64/nagios/plugins/check_proc_age.sh -p filter_grep -w 28800 -c 36000" every 10 cycles
    if status == 2 for 3 cycles then alert
    if status == 1 for 3 cycles then alert
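
For readers who do not have the Nagios plugin, here is a rough sketch of the kind of logic such a check might contain. It is not the actual check_proc_age.sh from Nagios Exchange. The function names max_age and classify_age are made up for illustration, and the sketch assumes a Linux procps "ps" that supports the "etimes" (elapsed seconds) column.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the real Nagios plugin.

# max_age PATTERN
# Reads lines of the form "ELAPSED_SECONDS COMMAND..." on stdin and prints
# the largest elapsed time among commands matching PATTERN (0 if none match).
max_age() {
    awk -v pat="$1" '
        { age = $1; $1 = "" }                      # keep the age, blank it out of the line
        $0 ~ pat && age + 0 > max { max = age + 0 }
        END { print max + 0 }'
}

# classify_age AGE WARN CRIT
# Prints a Nagios-style status: 0 = OK, 1 = WARNING, 2 = CRITICAL.
classify_age() {
    if [ "$1" -ge "$3" ]; then
        echo 2
    elif [ "$1" -ge "$2" ]; then
        echo 1
    else
        echo 0
    fi
}

# Possible wiring inside a check script (assumes procps "ps"):
#   age=$(ps -eo etimes=,args= | max_age '[f]ilter_grep')
#   exit "$(classify_age "$age" 28800 36000)"
```

The pattern is written as '[f]ilter_grep' so that the awk process running the match does not match its own command line. A missing process yields age 0 and therefore status 0, which is the "OK when not running" behaviour asked for above.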



Re: Check the age of a process

martinp@tildeslash.com
In reply to this post by monit-general mailing list
Yes, the uptime test can be used this way.

The process check will, however, alert if the process is not running. You can suppress this alert with an event filter: https://mmonit.com/monit/documentation/monit.html#Setting-an-event-filter
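
A minimal sketch of how the two pieces might be combined, based on the event filter syntax described in the manual. The address admin@example.com is a placeholder, and this fragment has not been tested against a particular Monit version. Note that a global "set alert" for a different recipient would still deliver the "does not exist" alerts.

```
check process filter_grep_age matching "filter_grep"
    # Placeholder recipient: only uptime events are mailed for this service,
    # so "process is not running" (nonexist) alerts are filtered out.
    alert admin@example.com only on { uptime }
    if uptime > 7 hours then alert
```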


Cheers,
Martin




Re: Check the age of a process

Lutz Mader
In reply to this post by monit-general mailing list
Hello David,
nice idea, have a look at
https://mmonit.com/monit/documentation/monit.html#UPTIME-TEST
https://mmonit.com/monit/documentation/monit.html#EXIST

Try combining the "exist" and "uptime" tests to stop a running process.

check process sleep matching "sleep"
  start program "/bin/ksh -c '/bin/echo start; /bin/sleep 300 &'"
  stop program "/bin/ksh -c 'kill -9 `ps -fu $USER | awk "/[Ss]leep/ { print \\\\$2 }"`'"
  if uptime > 3 minutes then alert
  if uptime > 5 minutes then exec "/bin/ksh -c 'kill -9 `ps -fu $USER | awk "/[Ss]leep/ { print \\\\$2 }"`'"
  if exist for 10 cycles then exec "/bin/ksh -c 'kill -9 `ps -fu $USER | awk "/[Ss]leep/ { print \\\\$2 }"`'"

As long as the process is running, the status is "Exists"; if the
process is not running, the status is "OK".

Unfortunately, you cannot use the "stop" action to stop the process,
because it sets the monitoring status to "unmonitor", which stops
monitoring of the process altogether.

Monit checks all processes matching "sleep", sends an alert after 3
minutes, and kills the process after 5 minutes. The exist test also
kills the process.

A sample only,
Lutz
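
The sample above kills every process that matches the pattern. To kill only the instances that have actually run too long, as asked in the first post, a small helper along the following lines could be called from the exec action. This is a sketch under assumptions: pids_older_than is a made-up name, and the commented "ps" and "xargs -r" invocations assume Linux procps and GNU xargs.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.

# pids_older_than PATTERN LIMIT_SECONDS
# Reads lines of the form "PID ELAPSED_SECONDS COMMAND..." on stdin and
# prints the PID of every matching command older than LIMIT_SECONDS.
pids_older_than() {
    awk -v pat="$1" -v limit="$2" '
        { pid = $1; age = $2; $1 = ""; $2 = "" }   # strip PID and age before matching
        $0 ~ pat && age + 0 > limit + 0 { print pid }'
}

# Possible use from a script that Monit runs via exec (10 hours = 36000 s):
#   ps -eo pid=,etimes=,args= | pids_older_than '[f]ilter_grep' 36000 | xargs -r kill
```

As in the awk expression /[Ss]leep/ above, the bracketed pattern keeps the awk process from matching its own command line.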