Node Health
This document describes how you can verify and monitor the health of your validator and validator fullnode (VFN) in the Aptos network. Many of the methods described here rely on the runtime metrics that your nodes collect and report. These metrics are collected by the Aptos node binary and are exposed via a Prometheus metrics endpoint. For a detailed description of the important metrics, see the Node Inspection Service and Important Node Metrics documentation.
Initial Node Verification
After deploying your nodes and connecting them to the Aptos network, you should verify that your nodes are running correctly.
In some environments, e.g., mainnet and testnet, your VFN will begin syncing first (before your validator is able to sync).
This is normal behaviour. Once your VFN has finished syncing, your validator node will start syncing and eventually start participating in consensus.
You can verify the correctness of your nodes by inspecting several simple metrics. Follow these steps:
- Check if your nodes are state syncing by running this command:
curl 127.0.0.1:9101/metrics 2> /dev/null | grep "aptos_state_sync_version"
You should expect to see the synced or synced_states versions increasing. The versions should start increasing for your VFN first, then eventually your validator node will start syncing.
Cloud deployment? You may need to replace 127.0.0.1 with your validator or VFN IP/DNS if deployed on the cloud.
- Verify that your validator is connecting to other peers on the network.
curl 127.0.0.1:9101/metrics 2> /dev/null | grep "aptos_connections{.*\"Validator\".*}"
The command will output the number of inbound and outbound connections of your validator node. For example:
aptos_connections{direction="inbound",network_id="Validator",peer_id="f326fd30",role_type="validator"} 5
aptos_connections{direction="outbound",network_id="Validator",peer_id="f326fd30",role_type="validator"} 2
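The two checks above can also be scripted. The following is a minimal sketch, not an official Aptos tool: the helper names (synced_version, validator_connections), the METRICS_URL and INTERVAL variables, and the --live flag are all hypothetical. It assumes the state sync metric carries a type="synced" label and that the connection metrics look like the example output above. Both helpers read Prometheus text from standard input, so they can be tried against saved output before pointing them at a running node.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, variables, and the --live flag are hypothetical.

# Assumed endpoint, taken from the curl commands above.
# Override METRICS_URL if the node runs in the cloud.
METRICS_URL="${METRICS_URL:-http://127.0.0.1:9101/metrics}"

# Reads Prometheus text on stdin and prints the value of the first
# aptos_state_sync_version sample labelled type="synced".
# The label layout is an assumption about the node's output.
synced_version() {
  awk '/^aptos_state_sync_version[{]/ && index($0, "type=\"synced\"") { print $NF; exit }'
}

# Reads Prometheus text on stdin and prints the sum of aptos_connections
# samples on the Validator network for the direction given in $1
# (inbound or outbound). Prints 0 when nothing matches.
validator_connections() {
  awk -v dir="$1" '
    /^aptos_connections[{]/ && index($0, "network_id=\"Validator\"") && index($0, "direction=\"" dir "\"") { sum += $NF }
    END { print sum + 0 }'
}

# Only contact the node when explicitly asked to.
if [ "${1:-}" = "--live" ]; then
  # Sample the synced version twice and compare the two readings.
  first=$(curl -s "$METRICS_URL" | synced_version)
  sleep "${INTERVAL:-10}"
  second=$(curl -s "$METRICS_URL" | synced_version)
  if [ "${second:-0}" -gt "${first:-0}" ]; then
    echo "state sync advancing: $first -> $second"
  else
    echo "state sync not advancing: ${first:-?} -> ${second:-?}"
  fi

  # Take one snapshot and report validator network connections.
  snapshot=$(curl -s "$METRICS_URL")
  echo "inbound: $(printf '%s\n' "$snapshot" | validator_connections inbound)"
  echo "outbound: $(printf '%s\n' "$snapshot" | validator_connections outbound)"
fi
```

Saved under a name of your choosing and run with --live, the script samples the synced version twice, INTERVAL seconds apart (10 by default), and then reports the inbound and outbound validator connection counts. A version that does not advance on a VFN, or on a validator after its VFN has finished syncing, is a cue to inspect the node more closely rather than a definitive failure signal.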