A collection of checks for your validators that are Nagios/Icinga2 compatible, or can be run in an idempotent manner. The primary goal was to achieve an idempotent health check for determining the status of a validator.
One challenge is testing that the block height is advancing. This implies the passage of time, so it is difficult to check idempotently.
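The time dependence can be sketched as a bounded polling loop. This is only an illustration, not the code used by these checks: `wait_for_advance`, `get_height`, and the injectable `clock` and `sleep` parameters are hypothetical names, and `get_height` stands in for whatever call returns the node's current block number.

```python
import time
from typing import Callable


def wait_for_advance(
    get_height: Callable[[], int],
    timeout: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the reported height increases before `timeout` seconds pass.

    `get_height` is a hypothetical callable returning the current block number.
    `clock` and `sleep` are injectable so the loop can be exercised without waiting.
    """
    start_height = get_height()
    deadline = clock() + timeout
    while clock() < deadline:
        sleep(interval)
        if get_height() > start_height:
            return True
    # The height never moved within the allowed window.
    return False
```

The result depends on when the function is called and how long it is allowed to wait, which is why a single point-in-time check cannot answer the question on its own.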
- Python 3.8+
- PyPI requests and websockets packages
  - pip install requests
  - pip install websockets
- Debian/Ubuntu, RedHat/CentOS/Fedora packages
  - python3-requests
  - python3-websockets
Compare the latest release version with the running version.
check_polkadot_release -s localhost -p 9944
Warn 6 hours after the latest release and go critical after 24 hours (thresholds are given in seconds).
check_polkadot_release -s localhost -p 9944 \
--since-release-warn 21600 \
--since-release-crit 86400
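The threshold logic can be sketched as a mapping from release age to the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL). This is a minimal sketch under assumptions, not the script's actual implementation: `release_status` is a hypothetical name, versions are compared by plain string equality, and a node that is behind but still inside the warning window is assumed to report OK.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def release_status(
    running: str,
    latest: str,
    seconds_since_release: float,
    warn: float = 21600,   # 6 hours, as in --since-release-warn above
    crit: float = 86400,   # 24 hours, as in --since-release-crit above
) -> int:
    """Map a version mismatch and the age of the latest release to an exit code."""
    if running == latest:
        return OK
    # Assumption: thresholds are inclusive and the critical one is checked first.
    if seconds_since_release >= crit:
        return CRITICAL
    if seconds_since_release >= warn:
        return WARNING
    # Assumption: an outdated node inside the grace period is not yet an alert.
    return OK
```

A caller would typically pass the result to `sys.exit()` so that Nagios or Icinga2 picks up the state from the process exit code.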
Perform a series of checks to confirm the validator is "active". The checks include:
- Whether the node is currently syncing
- Whether the number of connected peers is too low
- Whether the block height distance is too great
- Whether block numbers are increasing over time
check_polkadot_validator_active -s localhost -p 9944 \
--min-peers 5 \
--max-distance-warn 5 \
--max-distance-crit 8 \
--best-timeout 14 \
--finalized-timeout 28
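How the point-in-time checks might combine into one result is sketched below. Several things here are assumptions rather than the script's documented behaviour: the function name `evaluate_validator`, the severity given to each finding, the use of strict "greater than" comparisons, and the definition of distance as best block minus finalized block. The `peers` and `isSyncing` keys follow the shape of a Substrate `system_health` RPC response.

```python
from typing import Dict, List, Tuple

OK, WARNING, CRITICAL = 0, 1, 2


def evaluate_validator(
    health: Dict[str, object],
    best: int,
    finalized: int,
    min_peers: int = 5,
    max_distance_warn: int = 5,
    max_distance_crit: int = 8,
) -> Tuple[int, str]:
    """Combine the individual findings and return (worst status, message)."""
    findings: List[Tuple[int, str]] = []

    # Assumption: a syncing node is treated as critical.
    if health.get("isSyncing", False):
        findings.append((CRITICAL, "node is syncing"))

    peers = int(health.get("peers", 0))
    if peers < min_peers:
        findings.append((CRITICAL, f"{peers} peers, expected at least {min_peers}"))

    # Assumption: distance means best block number minus finalized block number.
    distance = best - finalized
    if distance > max_distance_crit:
        findings.append((CRITICAL, f"block distance {distance}"))
    elif distance > max_distance_warn:
        findings.append((WARNING, f"block distance {distance}"))

    if not findings:
        return OK, "validator looks active"
    worst = max(code for code, _ in findings)
    return worst, "; ".join(message for _, message in findings)
```

The check that block numbers increase over time is left out of this sketch on purpose, since it needs repeated observations within the --best-timeout and --finalized-timeout windows.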
You can also specify another node to compare block numbers with. The difference will use the same --max-distance-warn and --max-distance-crit thresholds.
Warning: This can lead to false positives and false negatives! It may be useful in some circumstances, so it is available.
By specifying --compare-to-uri, the best and finalized block numbers from the provided source will be used (as opposed to watching the node's block numbers until they increase within the defined timeout periods).
For example, if your node is working but the provided node has stopped advancing, an alert will be raised (a false positive). Conversely, if both your node and the provided node have stopped advancing, no alert will be raised (a false negative).
check_polkadot_validator_active -s localhost -p 9944 \
--min-peers 5 \
--max-distance-warn 5 \
--max-distance-crit 8 \
--compare-to-uri wss://kusama-rpc.polkadot.io
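The false positive and false negative cases can be reproduced with a small sketch of the comparison. It is illustrative only: `compare_to_reference` is a hypothetical name, and taking the larger absolute difference of the best and finalized numbers, with strict "greater than" thresholds, is an assumption about how the distance is measured.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def compare_to_reference(
    local_best: int,
    local_finalized: int,
    ref_best: int,
    ref_finalized: int,
    max_distance_warn: int = 5,
    max_distance_crit: int = 8,
) -> int:
    """Compare local block numbers with those of a reference node."""
    # Assumption: the larger of the two absolute differences decides the status.
    distance = max(
        abs(ref_best - local_best),
        abs(ref_finalized - local_finalized),
    )
    if distance > max_distance_crit:
        return CRITICAL
    if distance > max_distance_warn:
        return WARNING
    return OK


# False negative: both nodes are stuck at the same height, so nothing is reported.
stalled_together = compare_to_reference(1000, 998, 1000, 998)

# False positive: the local node is ahead because the reference node has stalled.
reference_stalled = compare_to_reference(1020, 1018, 1000, 998)
```

Because only the difference between the two nodes is examined, the comparison cannot tell which side is at fault, which is the trade-off described above.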