sdf-mon/notes.org

2.4 KiB
Raw Permalink Blame History

Checks

Local

DONE ping

CLOSED: [2013-11-05 Wed 15:52]

Coproc - Using SSH instead

DONE CPU - per core?

CLOSED: [2013-11-06 Wed 15:49] This is implemented as a total load reading.

DONE Possibly need a 2s sleep before taking measurements

CLOSED: [2013-11-08 Wed 15:49] This didn't help the top cpu readings but taking for readings then using 'tail -n +2' to skip the first one and average the last three appears to give more consistante readings.

DONE Mem

CLOSED: [2013-11-06 Wed 15:49] This is polled from the 'free -m' output, excluding the disk cache from the computed used memory value.

DONE Disk

CLOSED: [2013-11-05 Wed 15:48] Just parsed 'df -h | tail -n +2' output through awk.

Top - not included in report

Not really useful, not used.

DONE Service - specified in config script

CLOSED: [2013-11-06 Wed 15:48] Services just check the running processes list because I couldn't find a reliable way to parse 'service serviceName status' output over SSH.

Features

Run as service - abandoned in favor of running as a crontask

rsyslog for DB log storage (stretch goal)

DONE email notifications

CLOSED: [2013-11-06 Wed 15:47] Email notifications appear to work but must be used with the heirloom mailx implementation. The BSD and GNU implementations didn't appear to work with Exchange or the ETS smtp server for whatever reason but the heirloom mailx binary works just fine. The advantage is that the smtp server can be explicitly specified rather than infered from mx records.

DONE Only send notifications on repeat problem

CLOSED: [2013-11-13 Wed 16:52] The idea is that instead of sending an email notification, it should write a trigger file so the next time the sript runs, if there is need to send a notification and that file exists, the notifications will be sent. This will cut down on getting an email every time the application is starting up or reindexing the content database.

Notifications are now only sent the second time that any given thresholds are broken. commit a0e2c72e32

Documentation

DONE Is/Isn't

CLOSED: [2013-11-14 Thu 11:37]

DONE SSH

CLOSED: [2013-11-14 Thu 11:37]

DONE Config File

CLOSED: [2013-11-14 Thu 12:00]

DONE cron

CLOSED: [2013-11-14 Thu 12:30]

DONE mailx

CLOSED: [2013-11-14 Thu 12:52]

DONE Top Parsing

CLOSED: [2013-11-14 Thu 13:01]