Prometheus and Grafana
The Bifrost node and relayer both support system metric monitoring. This guide walks you through setting up Prometheus and Grafana to monitor your node and relayer.
Enable Prometheus
For your Bifrost node and relayer to expose Prometheus metrics, the exporter must be enabled manually. To enable the Prometheus server of your node, provide the following CLI flags and restart the node.
--prometheus-external: Exposes the Prometheus exporter on all interfaces.
--prometheus-port <PORT>: Sets the exporter port. The default port is 9615, so this flag is only needed if a different port is required.
If you are operating a full node, also enable the Prometheus server of your relayer by updating the following parameters in your configuration YAML file, then restart the relayer.
prometheus_config:
  is_enabled: true
  is_external: true
  port: 8000
After a successful restart, each service prints a log line like the following at launch (shown here for the node on its default port).
2023-07-14 18:24:54 〽️ Prometheus exporter started at 0.0.0.0:9615
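Beyond the log line, one way to confirm that the exporter is serving data is to request its metrics page and parse the response. The sketch below assumes the exporter answers on the `/metrics` path that Prometheus scrapes by default. The helper names `parse_metrics` and `fetch_metrics` are illustrative and not part of Bifrost, and the parser only handles the simple `name{labels} value` line form.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text exposition lines into {series: value}."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labelled series: the series name ends at the closing brace.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            series, _, tail = line.partition(" ")
            rest = tail.split()
        if rest:
            # First field is the value; an optional timestamp may follow.
            samples[series] = float(rest[0])
    return samples


def fetch_metrics(url="http://localhost:9615/metrics", timeout=5):
    """Download and parse a metrics page (assumes the /metrics path)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` on the node host should return a non-empty dictionary once the exporter is up. For the relayer, pass the port set in `prometheus_config`, for example `fetch_metrics("http://localhost:8000/metrics")`.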
Using Systemd
This section describes how to install and set up Prometheus and Grafana using Systemd.
Installing Prometheus
First, create the directories required to store the configuration and executable files.
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
Then, update your OS and install the latest Prometheus. You can check the latest version on the releases page of its GitHub repository.
sudo apt-get update && sudo apt-get upgrade
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
tar xfz prometheus-*.tar.gz
cd prometheus-2.45.0.linux-amd64
Copy the executable files to the /usr/local/bin/ directory.
sudo cp ./prometheus /usr/local/bin/
sudo cp ./promtool /usr/local/bin/
Copy the console files to the /etc/prometheus directory.
sudo cp -r ./consoles /etc/prometheus
sudo cp -r ./console_libraries /etc/prometheus
Once everything is done, remove the downloaded archive and the extracted prometheus directory.
cd .. && rm -rf prometheus*
Installing NodeExporter
Now, install the NodeExporter. You can check the latest version on the releases page of its GitHub repository.
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz
tar xvf node_exporter-*.tar.gz
sudo cp ./node_exporter-*.linux-amd64/node_exporter /usr/local/bin/
rm -rf ./node_exporter*
Installing AlertManager
First, create the directory required to store the configuration file.
sudo mkdir /etc/alertmanager
Next, install the AlertManager. You can check the latest version on the releases page of its GitHub repository.
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar xvf alertmanager-*.tar.gz
sudo cp ./alertmanager-*.linux-amd64/alertmanager /usr/local/bin/
sudo cp ./alertmanager-*.linux-amd64/amtool /usr/local/bin/
rm -rf ./alertmanager*
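Install the AlertManager datasource plugin for Grafana. This command requires Grafana to be installed first (see Installing Grafana); if Grafana is already running, restart it afterwards so that the plugin is loaded.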
sudo grafana-cli plugins install camptocamp-prometheus-alertmanager-datasource
Configure Alert Rules
Create the rules.yml file, which defines the alert rules that Prometheus evaluates and forwards to the AlertManager.
sudo vi /etc/prometheus/rules.yml
We will create two basic rules: one that fires when an instance is down, and one that fires when CPU usage exceeds 80%. Add the following lines and save the file.
groups:
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "[{{ $labels.instance }}] of job [{{ $labels.job }}] has been down for more than 5 minutes."
      - alert: HostHighCpuLoad
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "Host high CPU load (instance {{ $labels.instance }})"
          description: "CPU load is > 80%\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
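To make the arithmetic behind the HostHighCpuLoad expression concrete, here is a simplified sketch of what `100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100)` computes for a single instance. It is an approximation under stated assumptions: Prometheus's real `rate()` also extrapolates to the window edges and handles counter resets, which this ignores, and the function names are hypothetical.

```python
def cpu_usage_percent(idle_start, idle_end, window_seconds):
    """Approximate 100 - avg(rate(idle)) * 100 for one instance.

    idle_start and idle_end map a CPU core id to its cumulative idle
    seconds at the start and at the end of the window.
    """
    # Per-core idle rate: idle seconds gained per second of wall time.
    rates = [
        (idle_end[cpu] - idle_start[cpu]) / window_seconds
        for cpu in idle_start
    ]
    # Average the idle fraction across cores, then invert it into usage.
    avg_idle = sum(rates) / len(rates)
    return 100 - avg_idle * 100


def high_cpu_load(idle_start, idle_end, window_seconds, threshold=80):
    """True when usage exceeds the threshold, like the `> 80` comparison."""
    return cpu_usage_percent(idle_start, idle_end, window_seconds) > threshold
```

For example, if two cores gain 24 and 12 idle seconds over a 120-second window, their idle rates are 0.2 and 0.1, the average idle fraction is 0.15, and the usage is about 85%, which is above the 80% threshold.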
The alertmanager.yml file sets the external service that is called when an alert is triggered. Here, we use Gmail notifications.
Create the file in the following path.
sudo vi /etc/alertmanager/alertmanager.yml
Add your Gmail configuration to it as below and save the file.
global:
  resolve_timeout: 1m
route:
  receiver: 'gmail-notifications'
receivers:
  - name: 'gmail-notifications'
    email_configs:
      - to: '' # receiver email
        from: '' # sender(monitoring system) gmail
        smarthost: 'smtp.gmail.com:587'
        auth_username: '' # sender(monitoring system) gmail
        auth_identity: '' # sender(monitoring system) gmail
        auth_password: '' # sender(monitoring system) gmail's app password <https://support.google.com/mail/answer/185833?hl=en>
        send_resolved: true
Configure Prometheus
Prometheus needs a configuration file in order to start. Create a configuration YAML file at the following path.
sudo vi /etc/prometheus/prometheus.yml
The configuration file should look as below.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  - "rules.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:9090" ]
  - job_name: "bifrost_node"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:9615" ]
  - job_name: "node_exporter"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:9100" ]
  - job_name: "bifrost_relayer"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:8000" ]
Starting Prometheus
Next, set up the Systemd configuration for Prometheus. Create a unit file at the following path.
sudo vi /etc/systemd/system/prometheus.service
The configuration file should look as below.
[Unit]
Description=Prometheus Monitoring
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file /etc/prometheus/prometheus.yml \
  --storage.tsdb.path /var/lib/prometheus/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target
Now, enable and start the service.
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
To check that everything worked, open YOUR_SERVER_IP_ADDRESS:9090 in a browser. If the Prometheus dashboard appears, the service is running.
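Seeing the dashboard only shows that Prometheus itself is up. To check whether each scrape job is actually reachable, the Prometheus HTTP API exposes target health at `/api/v1/targets`. The sketch below is one hedged way to read it; the helper names are hypothetical, and only the `labels` and `health` fields of each active target are used.

```python
import json
from urllib.request import urlopen


def summarize_targets(payload):
    """Map each "job@instance" pair to its scrape health."""
    summary = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        key = f'{labels["job"]}@{labels["instance"]}'
        # Health is reported as "up", "down" or "unknown".
        summary[key] = target["health"]
    return summary


def fetch_targets(base_url="http://localhost:9090", timeout=5):
    """Query a running Prometheus server and summarize its targets."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as response:
        return summarize_targets(json.load(response))
```

Running `fetch_targets()` on the server should list the jobs from prometheus.yml. A target such as `bifrost_relayer@localhost:8000` reporting `down` usually means the corresponding exporter is not enabled or not yet started.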
Starting NodeExporter
Next, set up the Systemd configuration for the NodeExporter. Create a unit file at the following path.
sudo vi /etc/systemd/system/node_exporter.service
The configuration file should look as below.
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
Now, enable and start the service.
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
Starting AlertManager
Next, set up the Systemd configuration for the AlertManager. Create a unit file at the following path.
sudo vi /etc/systemd/system/alertmanager.service
The configuration file should look as below.
[Unit]
Description=AlertManager Server Service
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
ExecStart=/usr/local/bin/alertmanager \
  --config.file /etc/alertmanager/alertmanager.yml \
  --storage.path /var/lib/alertmanager \
  --web.external-url=http://localhost:9093 \
  --cluster.advertise-address='0.0.0.0:9093'
[Install]
WantedBy=multi-user.target
Now, enable and start the service.
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
Installing Grafana
To visualize your Prometheus metrics, install Grafana, which queries the Prometheus server. The latest version can be checked on the Grafana download page. Execute the following commands to install Grafana and its dependencies.
sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/enterprise/release/grafana-enterprise_10.0.1_amd64.deb
sudo dpkg -i grafana-enterprise_10.0.1_amd64.deb
Starting Grafana
Then enable and start the service with default configurations.
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
You can now access Grafana at YOUR_SERVER_IP_ADDRESS:3000/login. The default username and password are admin/admin.
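With all services started, a quick connectivity check can confirm that each one is listening. The sketch below simply attempts a TCP connection to each default port used in this guide; it does not verify that the service behind the port is healthy. The names `SERVICE_PORTS`, `port_is_open` and `check_services` are illustrative.

```python
import socket

# Default ports used in this guide; adjust them if you changed any.
SERVICE_PORTS = {
    "prometheus": 9090,
    "node_exporter": 9100,
    "alertmanager": 9093,
    "grafana": 3000,
    "bifrost_node": 9615,
    "bifrost_relayer": 8000,
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_services(host="localhost", ports=SERVICE_PORTS):
    """Return {service name: True/False} for every configured port."""
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

Calling `check_services()` on the server returns a dictionary of service names to booleans; any `False` entry points at a service that is stopped or bound to a different port.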
Using Docker
This section describes how to install and set up Prometheus and Grafana using Docker.
Requirements
First, Docker and Docker Compose must be installed on your server. The docker-compose.yml file is provided in the Bifrost node's GitHub repository, in the maintenance directory. Clone the repository and move into that directory with the commands below.
git clone https://github.com/bifrost-platform/bifrost-node.git
cd bifrost-node/maintenance
Configure AlertManager
The alert rules are pre-defined in the maintenance/prometheus/rules.yml file. It contains two basic rules: one that fires when an instance is down, and one that fires when CPU usage exceeds 80%. The file is provided as below.
groups:
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance $labels.instance down"
          description: "[{{ $labels.instance }}] of job [{{ $labels.job }}] has been down for more than 1 minute."
      - alert: HostHighCpuLoad
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: Host high CPU load (instance bLd Kusama)
          description: "CPU load is > 80%\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
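The `for` field in each rule controls how long the expression must stay true before the alert fires: until then the alert is pending, and it resets if the expression stops matching. The sketch below models that behavior for a single alert under simplified assumptions (evaluations are given as timestamped booleans, and the function name `alert_state` is hypothetical). With `for: 5m`, InstanceDown fires only after `up == 0` has held for 300 seconds, while `for: 0m` makes HostHighCpuLoad fire on the first matching evaluation.

```python
def alert_state(evaluations, hold_seconds):
    """Return "inactive", "pending" or "firing" after the last evaluation.

    evaluations is a list of (timestamp_seconds, condition_is_true) pairs
    in chronological order; hold_seconds is the rule's `for` duration.
    """
    active_since = None
    state = "inactive"
    for timestamp, condition in evaluations:
        if not condition:
            # The expression stopped matching: the alert resets.
            active_since = None
            state = "inactive"
            continue
        if active_since is None:
            active_since = timestamp
        # Fire once the condition has held for the whole `for` duration.
        held = timestamp - active_since
        state = "firing" if held >= hold_seconds else "pending"
    return state
```

A single successful scrape in the middle of an outage restarts the five-minute countdown, which is why short flaps do not page anyone with these rules.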
The alertmanager.yml file sets the external service that is called when an alert is triggered. Here, we use Gmail notifications.
The file is located at maintenance/alertmanager/alertmanager.yml. Add your Gmail configuration to it as below and save the file.
global:
  resolve_timeout: 1m
route:
  receiver: 'gmail-notifications'
receivers:
  - name: 'gmail-notifications'
    email_configs:
      - to: '' # receiver email
        from: '' # sender(monitoring system) gmail
        smarthost: 'smtp.gmail.com:587'
        auth_username: '' # sender(monitoring system) gmail
        auth_identity: '' # sender(monitoring system) gmail
        auth_password: '' # sender(monitoring system) gmail's app password <https://support.google.com/mail/answer/185833?hl=en>
        send_resolved: true
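Because the template above ships with empty strings, it is easy to start the AlertManager with a field left blank. The sketch below is a hypothetical helper that lists which of those placeholder fields are still empty. It assumes the YAML has already been loaded into a Python dictionary (for example with a YAML parser of your choice) and checks only the fields this guide leaves blank, not everything the AlertManager itself validates.

```python
# Fields that the template in this guide leaves as empty strings.
PLACEHOLDER_FIELDS = (
    "to",
    "from",
    "auth_username",
    "auth_identity",
    "auth_password",
)


def unfilled_email_fields(config):
    """Return "receiver.field" names that are still empty or missing."""
    missing = []
    for receiver in config.get("receivers", []):
        for email in receiver.get("email_configs", []):
            for field in PLACEHOLDER_FIELDS:
                if not email.get(field):
                    missing.append(f'{receiver["name"]}.{field}')
    return missing
```

An empty list means every placeholder has been filled in; anything else names the receiver and field that still needs a value.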
Configure Prometheus
Prometheus needs a configuration file in order to start. The file is located at maintenance/prometheus/prometheus.yml. Full-node operators who run both the node and the relayer should manually uncomment the "bifrost_relayer" job below.
global:
  scrape_interval: 3s
  evaluation_interval: 3s
rule_files:
  - "rules.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "alertmanager:9093"
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 3s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "bifrost_node"
    scrape_interval: 3s
    static_configs:
      - targets: ["host.docker.internal:9615"]
  - job_name: "node_exporter"
    scrape_interval: 3s
    static_configs:
      - targets: ["node-exporter:9100"]
  # - job_name: "bifrost_relayer"
  #   scrape_interval: 3s
  #   static_configs:
  #     - targets: ["host.docker.internal:8000"]
Run Docker Containers
If you have followed every step above, return to the maintenance directory and execute the following command.
docker compose up -d
You can now access Grafana at YOUR_SERVER_IP_ADDRESS:3000/login. The default username and password are admin/admin.
Datasource Configuration
Once everything is running, create a new Prometheus datasource in Grafana, enter http://localhost:9090 as the URL, and click "Save & Test".

Then, create a new Prometheus AlertManager datasource, enter http://localhost:9093 as the URL, and click "Save & Test".

Next, import the dashboard. Open the "Dashboards" tab and click "New" to start the import.

Now, in the "Import via grafana.com" section, enter 19207 as the dashboard ID and click "Load" to continue.

Once the dashboard has loaded, select the Prometheus and AlertManager datasources that you just created; both must be set correctly. Then click "Import" to continue.

If your node and relayer are running in the background, the collected metrics will now be visualized on the dashboard.
