Sometimes when restarting Monit it resets all services to unmonitored

Ciprian Dorin Craciun
Hello all!

I'm using Monit 5.25.1 (on OpenSUSE 15.1), and I sometimes encounter
the following strange issue:  either on restarting monit (via
`systemd`) or rebooting the system I sometimes have all the services
(and other monitored items) reset as unmonitored.

As additional details:
* I do not use `start` / `stop` / `unmonitor` as action to any of my
tests  (just `alert`);
* I do explicitly use `mode passive onreboot nostart` on every monitored item;
* the `state` and `id` files do persist on reboots;

I haven't been able to reliably reproduce this, but it does happen
from time-to-time.

Thanks,
Ciprian.


Re: Sometimes when restarting Monit it resets all services to unmonitored

Lutz Mader
Hello Ciprian,
are you using one of the distribution-specific Monit packages or the
tildeslash Monit package?

> I'm using Monit 5.25.1 (on OpenSUSE 15.1), and I sometimes encounter
> the following strange issue:  either on restarting monit (via
> `systemd`) or rebooting the system I sometimes have all the services
> (and other monitored items) reset as unmonitored.

Please check the scripts used in /etc/init.d.

A suggestion only,
Lutz


Re: Sometimes when restarting Monit it resets all services to unmonitored

Ciprian Dorin Craciun
On Wed, Aug 5, 2020 at 10:39 PM Lutz Mader <[hidden email]> wrote:
> Hello Ciprian,
> are you using one of the distribution-specific Monit packages or the
> tildeslash Monit package?


I am using the distribution-specific package, although the patches
added by openSUSE are minimal according to their build, and they
mainly involve the configuration file:

    https://build.opensuse.org/package/show/openSUSE:Leap:15.1/monit


> Please check the scripts used in /etc/init.d.

I use systemd with a custom service unit.

Basically it starts monit as `monit -c /.../conf -I`.

Other settings pertaining to shutdown (or that would trigger an
improper shutdown) would be:

~~~~
# when stopping send SIGTERM, wait 60s then SIGKILL
KillSignal = SIGTERM
SendSIGKILL = true
TimeoutStopSec = 60s

# limit number of processes / threads and memory
TasksMax = 32
MemoryMax = 256M
~~~~

Ciprian.


Re: Sometimes when restarting Monit it resets all services to unmonitored

Lutz Mader
Hello Ciprian,
I will have a look at
https://software.opensuse.org/download/package?package=monit&project=openSUSE%3ALeap%3A15.1

> I am using the distribution specific package.  Although the patches
> added by OpenSUSE are minimal according to their build and they
> involve mainly the configuration file:
>
>     https://build.opensuse.org/package/show/openSUSE:Leap:15.1/monit 

But based on a quick check, I cannot find any additional shutdown
scripts in the package monit-5.25.1-lp151.2.3.x86_64.rpm.

Some other platform-specific Monit packages ship additional sample
scripts to give some help. But sometimes those scripts are not useful.

With regards,
Lutz


Re: Sometimes when restarting Monit it resets all services to unmonitored

Ciprian Dorin Craciun
On Thu, Aug 6, 2020 at 9:03 AM Lutz Mader <[hidden email]> wrote:
> But based on a quick check, I cannot find any additional shutdown
> scripts in the package monit-5.25.1-lp151.2.3.x86_64.rpm.
>
> Some other platform-specific Monit packages ship additional sample
> scripts to give some help. But sometimes those scripts are not useful.


I would exclude the possibility of explicit Monit commands on shutdown
or restart, due to the fact that I use a custom (from ground-up)
systemd unit, which besides the usual systemd "incantations" and the
resource limiting highlighted above, just calls `monit -c ... -I`;
nothing else is called on shutdown or restart.

My hunch is that something happens and corrupts the state file, thus
Monit recreates one from scratch;  the initial state of the services
(without a proper state file) seems to be unmonitored, thus the
situation.

Ciprian.


Re: Sometimes when restarting Monit it resets all services to unmonitored

Ciprian Dorin Craciun
In reply to this post by Ciprian Dorin Craciun
I have just rebooted one of the systems, and once restarted again I
encountered the "unmonitored" status.

This time I've checked the logs:
~~~~
Monit daemon with pid [32657] stopped
'system' Monit 5.25.1 stopped
Starting Monit 5.25.1 daemon with http interface at [127.156.202.230]:8080
Monit start delay set to 30s
'system' Monit 5.25.1 started
~~~~

And all my services have the status:
~~~~
  status                       Not monitored
  monitoring status            Not monitored
  monitoring mode              passive
  on reboot                    nostart
~~~~

So, given that I can't reproduce this on (all) normal Monit restarts,
but only on (most) reboots, I am wondering whether Monit somehow
detects a reboot and does something different in that case.

Ciprian.


Re: Sometimes when restarting Monit it resets all services to unmonitored

Ciprian Dorin Craciun
In reply to this post by Ciprian Dorin Craciun
OK...  I've tried the following combinations and these were the outcomes:

* `mode passive onreboot nostart` -- on system reboot none of the
services or items are monitored;
* `mode passive` -- on system reboot **all** services and items are
monitored, regardless of whether some were marked as "unmonitored"
before the reboot;
* `mode passive onreboot laststate` -- on system reboot Monit appears
to keep the monitored state as was before the reboot;

Additional observations:
* this monitoring reset seems to happen **only** on reboot, but
**not** on restarting of the Monit service;  my assumption is that
Monit detects a system reboot and behaves differently;
* although I have `start` / `stop` / `restart` commands defined, and I
have `if not exist then alert`, I don't know whether Monit tries to
start some services automatically;  (based on what Lutz Mader said, I
would assume it won't;)

My conclusion:  the "correct" setting should be `mode passive onreboot
laststate`.
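
A minimal sketch of a check using this setting, assuming a hypothetical service name and pidfile path (neither is taken from the configuration discussed in this thread):

~~~~
# hypothetical service; the name and pidfile path are placeholders
check process example-service with pidfile /run/example-service.pid
    # passive mode: Monit only alerts, it does not try to fix the service
    mode passive
    # after a system reboot, restore the monitoring state from before the reboot
    onreboot laststate
    if not exist then alert
~~~~

Swapping `laststate` for `nostart` in such a check would, going by the observations above, leave it unmonitored after a reboot.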

Hope it helps,
Ciprian.


Re: Sometimes when restarting Monit it resets all services to unmonitored

Lutz Mader
Hello Ciprian,
this is a useful setting.

> My conclusion:  the "correct" setting should be `mode passive onreboot
> laststate`.

That works, as long as you are aware that services which were
"stopped"/"not monitored" before the reboot will not be monitored
again afterwards. But you should have no problems, since you said
Monit is used for monitoring purposes only.

To do a "restart" test without a restart, I stop Monit and delete the
".monit.state" file. After Monit is up again, it cannot detect the
last state and thinks it is a restart.

With regards,
Lutz


Re: Sometimes when restarting Monit it resets all services to unmonitored

Ciprian Dorin Craciun
On Fri, Aug 7, 2020 at 9:31 PM Lutz Mader <[hidden email]> wrote:
> To do a "restart" test without a restart, I stop Monit and delete the
> ".monit.state" file. After Monit is up again, it cannot detect the
> last state and thinks it is a restart.

But this is what seems strange:  in a "normal" deployment, both the
`state` and `id` files are placed under `/run`, and thus on many
distributions lost after a reboot.

However in my case, I make sure I keep these files in a persistent
folder on the disk, as such they are not lost on reboot.
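
A sketch of such a setup in the Monit control file, assuming a hypothetical persistent directory (`/var/lib/monit` is a placeholder, not the path actually used here):

~~~~
# keep Monit's state and id files on persistent storage instead of /run
# (both paths are illustrative assumptions)
set statefile /var/lib/monit/state
set idfile /var/lib/monit/id
~~~~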

Therefore I still suspect that Monit has some "reboot" detection
outside of those two files.

Ciprian.