How to monitor whether a Linux server is up or has been rebooted

Hi,

We are having issues with the tools that monitor our servers at the OS level, and we are trying to find a solution in Foglight.

We would like to monitor whether a server is up or down, and whether it has been rebooted.

If anybody has an idea how to do this, please let us know.

Thank you

PC

  • Hi Philippe,

    Have you looked at the Netmonitor agent that is part of the Infrastructure cartridge?
    You'd have it included with your license so long as a monitored DB is licensed for the Linux server in question. The server would also need to respond to ping requests. With that in place, the agent will give you up/down status and availability.

    Regards,

    -Darren

  • Thank you Darren, I'll have a look at it. 

  • There are a few approaches you can consider. One option is to use the monitoring that Foglight itself provides: its agents and rules can be configured to track server status and detect reboots. With suitable thresholds in place, you receive an alert whenever a server goes offline or reboots.

    Another option is to use the tools built into the operating system. On Linux, "ping" (run from another machine) checks whether a server responds, and "uptime" reports how long the server has been running since its last reboot. On Windows, "ping" or a PowerShell script can achieve similar results.
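
    As a rough illustration of the OS-level approach, here is a minimal Python sketch. It assumes the Linux "ping" command (with its -c and -W options) is available on the machine doing the check, and that the last-boot time is read locally from /proc/uptime. The host name and function names are made up for the example.

```python
import subprocess
import time
from typing import Optional


def is_reachable(host: str, timeout_s: int = 2) -> bool:
    """Return True if `host` answers a single ping (Linux `ping` flags assumed)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The ping binary is missing or cannot be executed.
        return False
    return result.returncode == 0


def parse_uptime_seconds(proc_uptime_text: str) -> float:
    """Extract seconds since boot from the contents of /proc/uptime."""
    return float(proc_uptime_text.split()[0])


def boot_time_epoch(uptime_seconds: float, now: Optional[float] = None) -> float:
    """Convert an uptime into the approximate boot time (Unix epoch seconds)."""
    if now is None:
        now = time.time()
    return now - uptime_seconds


if __name__ == "__main__":
    host = "server01.example.com"  # hypothetical host name
    print(f"{host} reachable: {is_reachable(host)}")
    # /proc/uptime exists on Linux only; this reads the local machine's uptime.
    with open("/proc/uptime") as f:
        uptime = parse_uptime_seconds(f.read())
    print("last boot:", time.ctime(boot_time_epoch(uptime)))
```

    The ping check runs from the monitoring machine, while the uptime check has to run on the monitored server itself (or over SSH), so in practice the two usually live in different places.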

  • One thing that worked for me was setting up custom rules in Foglight to monitor server status. Essentially, you create alerts that trigger when a server goes down or restarts. This involves scripting conditions that check server uptime (an uptime value lower than the previous sample means the server restarted) or watch for sudden changes in system metrics. Foglight's flexibility with custom scripts is super handy here.
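
    The uptime condition could look roughly like the sketch below. This is plain Python that only illustrates the comparison logic; it is not Foglight rule syntax, and the function name and the convention of passing None for a missing sample are assumptions made for the example.

```python
from typing import Optional


def classify_sample(prev_uptime: Optional[float], curr_uptime: Optional[float]) -> str:
    """Classify a server from two consecutive uptime samples (in seconds).

    None means the sample could not be collected in that interval.
    """
    if curr_uptime is None:
        # No data this interval: treat the server as down or unreachable.
        return "down"
    if prev_uptime is not None and curr_uptime < prev_uptime:
        # Uptime only grows while a server keeps running, so a drop means a reboot.
        return "rebooted"
    return "up"


if __name__ == "__main__":
    samples = [500.0, 800.0, None, 45.0]  # hypothetical uptime readings
    prev = None
    for curr in samples:
        print(curr, "->", classify_sample(prev, curr))
        if curr is not None:
            prev = curr  # remember the last successful reading
```

    Keeping the last successful reading means a reboot is still detected when one or more samples were missed while the server was restarting.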

  • If you're monitoring this Linux server with a UnixAgentPlus agent, you should already be able to use the "Host Monitored" rule.

    • This rule will fire an UNMONITORED Warning alarm if it can't collect metrics in a collection interval.
    • It can fire an UNAVAILABLE Critical alarm if the UnixAgentPlus agent is unable to reach the host by ping and also fails to collect metrics in a collection interval. For this, the agent needs to have "Use ping to validate host availability" activated.
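
    To make the two cases above easier to compare, here they are restated as a small Python function. This is only a paraphrase of the behavior described in this post, not the actual rule definition; the function and parameter names are invented for the example.

```python
from typing import Optional


def host_monitored_alarm(
    metrics_collected: bool,
    ping_validation_enabled: bool,
    ping_ok: bool,
) -> Optional[str]:
    """Return the alarm described above for one collection interval, or None."""
    if metrics_collected:
        # Metrics arrived, so neither alarm applies.
        return None
    if ping_validation_enabled and not ping_ok:
        # No metrics and the host does not answer ping.
        return "UNAVAILABLE (Critical)"
    # No metrics, and ping either succeeded or is not being used.
    return "UNMONITORED (Warning)"


if __name__ == "__main__":
    print(host_monitored_alarm(False, True, False))
    print(host_monitored_alarm(False, False, False))
```

    Without the ping option, a host that is completely down and a host whose agent simply cannot collect metrics both show up as UNMONITORED.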

    I hope this helps.

    -J. Nunez