# Prometheus and Grafana

Bifrost nodes and relayers support system metric monitoring. This guide walks you through setting up [Prometheus](https://prometheus.io/) and [Grafana](https://grafana.com/) to monitor your node and relayer.

## Enable Prometheus

Before your Bifrost node and relayer can expose Prometheus metrics, the feature must be enabled manually. To enable the Prometheus server of your node, provide the following CLI flags and then restart the node.

* `--prometheus-external`: Exposes the Prometheus exporter on all interfaces.
* `--prometheus-port <PORT>`: Sets the exporter port. The default is `9615`, so this flag is only needed if you want a different port.

If you are operating a full node, also enable the Prometheus server of your relayer: update the following parameters in your configuration YAML file as shown below, then restart the relayer.

```yaml
prometheus_config:
  is_enabled: true
  is_external: true
  port: 8000
```

If the restart succeeded, each service prints a log line similar to the following at launch.

```sh
2023-07-14 18:24:54 〽️ Prometheus exporter started at 0.0.0.0:9615
```
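One way to confirm that both exporters are reachable is to request their metrics over HTTP. The sketch below assumes the ports shown above (`9615` for the node, `8000` for the relayer) and the conventional `/metrics` path; the relayer path is an assumption, and `check_metrics` is only a helper name made up for this example.

```sh
# check_metrics is a hypothetical helper: it fetches a metrics URL and prints the first lines.
check_metrics() {
  # -f fails on HTTP errors, -s hides progress output, --max-time avoids hanging on a closed port
  if body="$(curl -fs --max-time 3 "$1" 2>/dev/null)"; then
    echo "ok: $1"
    printf '%s\n' "$body" | head -n 5
  else
    echo "unreachable: $1"
    return 1
  fi
}

# Node first, then relayer (relayer path assumed); keep going so both endpoints are reported
for url in "http://localhost:9615/metrics" "http://localhost:8000/metrics"; do
  check_metrics "$url" || true
done
```

If an endpoint is reported as unreachable, check that the service was restarted with the settings above and that the port matches your configuration.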

## Using Systemd

This section describes how to install and set up Prometheus and Grafana using Systemd.

### Installing Prometheus

First, create the directories required to store the configuration and executable files.

```sh
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
```

Then, update your OS and install the latest Prometheus. You can check the latest releases by going to their GitHub repository under the [releases](https://github.com/prometheus/prometheus/releases/) page.

```sh
sudo apt-get update && sudo apt-get upgrade
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
tar xfz prometheus-*.tar.gz
cd prometheus-2.45.0.linux-amd64
```

Copy the executable files to the `/usr/local/bin/` directory.

```sh
sudo cp ./prometheus /usr/local/bin/
sudo cp ./promtool /usr/local/bin/
```

Copy the console files to the `/etc/prometheus` directory.

```sh
sudo cp -r ./consoles /etc/prometheus
sudo cp -r ./console_libraries /etc/prometheus
```

Once everything is copied, remove the downloaded archive and the extracted `prometheus` directory.

```sh
cd .. && rm -rf prometheus*
```

### Installing NodeExporter

Now, install NodeExporter. You can check the latest releases on the [releases](https://github.com/prometheus/node_exporter/releases) page of its GitHub repository.

```sh
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.0/node_exporter-1.6.0.linux-amd64.tar.gz
tar xvf node_exporter-*.tar.gz
sudo cp ./node_exporter-*.linux-amd64/node_exporter /usr/local/bin/
rm -rf ./node_exporter*
```

### Installing AlertManager

First, create the directory required to store the configuration file.

```sh
sudo mkdir /etc/alertmanager
```

Next, install AlertManager. You can check the latest releases on the [releases](https://github.com/prometheus/alertmanager/releases) page of its GitHub repository.

```sh
wget https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar xvf alertmanager-*.tar.gz
sudo cp ./alertmanager-*.linux-amd64/alertmanager /usr/local/bin/
sudo cp ./alertmanager-*.linux-amd64/amtool /usr/local/bin/
rm -rf ./alertmanager*
```

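Install the AlertManager datasource plugin for Grafana. This command requires `grafana-cli`, so if Grafana is not installed yet, run it after completing the "Installing Grafana" step further down.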

```sh
sudo grafana-cli plugins install camptocamp-prometheus-alertmanager-datasource
```

### Configure Alert Rules

Create the `rules.yml` file, which defines the alert rules that Prometheus evaluates and forwards to the AlertManager.

```sh
sudo vi /etc/prometheus/rules.yml
```

We are going to create two basic rules that trigger an alert when an instance is down or when CPU usage exceeds 80%. Add the following lines and save the file.

```yaml
groups:
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "[{{ $labels.instance }}] of job [{{ $labels.job }}] has been down for more than 5 minutes."

      - alert: HostHighCpuLoad
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "Host high CPU load (instance {{ $labels.instance }})"
          description: "CPU load is > 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
```
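To make the `HostHighCpuLoad` expression concrete: `rate(node_cpu_seconds_total{mode="idle"}[2m])` yields, per core, the fraction of time spent idle, `avg by(instance)` averages those fractions, and `100 - (... * 100)` turns the result into a usage percentage. The sketch below redoes that arithmetic with made-up idle fractions for a four-core host; the numbers and the `cpu_usage_percent` helper are illustrative assumptions, not values from a real node.

```sh
# cpu_usage_percent is a hypothetical helper: it averages the idle fractions passed
# as arguments and converts the average into a CPU usage percentage.
cpu_usage_percent() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.1f\n", 100 - (sum / NR) * 100 }'
}

# Invented per-core idle fractions (10%, 20%, 15%, 15% idle)
usage="$(cpu_usage_percent 0.10 0.20 0.15 0.15)"
echo "CPU usage: ${usage}%"

# Same comparison as the rule's "> 80" threshold
if awk -v u="$usage" 'BEGIN { exit !(u > 80) }'; then
  echo "HostHighCpuLoad would fire"
fi
```

Because the rule uses `for: 0m`, the alert fires as soon as a single evaluation crosses the threshold.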

The `alertmanager.yml` file configures the external service that is called when an alert is triggered. Here, we use Gmail notifications.

{% hint style="info" %}
For Gmail notifications, you need to generate an app password. We recommend using a dedicated email address for your alerts. To set one up, follow this [link](https://support.google.com/mail/answer/185833?hl=en).
{% endhint %}

Create the file in the following path.

```sh
sudo vi /etc/alertmanager/alertmanager.yml
```

Add the Gmail configuration shown below and save the file.

```yaml
global:
  resolve_timeout: 1m

route:
  receiver: 'gmail-notifications'

receivers:
  - name: 'gmail-notifications'
    email_configs:
      - to: '' # receiver email
        from: '' # sender(monitoring system) gmail
        smarthost: 'smtp.gmail.com:587'
        auth_username: '' # sender(monitoring system) gmail
        auth_identity: '' # sender(monitoring system) gmail
        auth_password: '' # sender(monitoring system) gmail's app password <https://support.google.com/mail/answer/185833?hl=en>
        send_resolved: true
```

<details>

<summary>Example</summary>

```yaml
global:
  resolve_timeout: 1m

route:
  receiver: 'gmail-notifications'

receivers:
  - name: 'gmail-notifications'
    email_configs:
      - to: 'receiver-example@gmail.com'
        from: 'sender-example@gmail.com'
        smarthost: 'smtp.gmail.com:587'
        auth_username: 'sender-example@gmail.com'
        auth_identity: 'sender-example@gmail.com'
        auth_password: 'my-auth-password'
        send_resolved: true
```

</details>

### Configure Prometheus

Prometheus needs a configuration file before it can start. Create a YAML configuration file at the following path.

```sh
sudo vi /etc/prometheus/prometheus.yml
```

The configuration file should look as below.

```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"

scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:9090" ]
  - job_name: "bifrost_node"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:9615" ]
  - job_name: "node_exporter"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:9100" ]
  - job_name: "bifrost_relayer"
    scrape_interval: 5s
    static_configs:
      - targets: [ "localhost:8000" ]
```
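Before moving on to the Systemd units, the configuration files written so far can be validated with the `promtool` and `amtool` binaries copied to `/usr/local/bin/` earlier. The following is a minimal sketch that assumes the file paths used in this guide; `run_check` is a hypothetical wrapper that only guards against a missing tool.

```sh
# run_check is a hypothetical helper: run a validation tool only if it is installed.
run_check() {
  tool="$1"; shift
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool not found in PATH"
    return 127
  fi
  "$tool" "$@"
}

# promtool also loads the files listed under rule_files (rules.yml in this guide)
run_check promtool check config /etc/prometheus/prometheus.yml \
  || echo "Prometheus configuration needs attention"

# amtool validates the AlertManager configuration, including the receivers
run_check amtool check-config /etc/alertmanager/alertmanager.yml \
  || echo "AlertManager configuration needs attention"
```

Both tools exit with a non-zero status when they find a problem, so the same commands can also be used in a deployment script.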

### Starting Prometheus

Next, configure Systemd for Prometheus. Create a unit file at the following path.

```sh
sudo vi /etc/systemd/system/prometheus.service
```

The configuration file should look as below.

```systemd
[Unit]
Description=Prometheus Monitoring
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file /etc/prometheus/prometheus.yml \
  --storage.tsdb.path /var/lib/prometheus/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```

Now, enable and start the service.

```sh
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
```

To verify that everything works, open `YOUR_SERVER_IP_ADDRESS:9090` in a browser. If the Prometheus dashboard appears, you are good to go.
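The same check can be done from the server's shell through the Prometheus HTTP API, which reports the health of every scrape target at `/api/v1/targets`. The sketch below counts targets by health state using plain `grep`, which relies on the API returning compact JSON; a JSON-aware tool such as `jq` would be more robust. `count_targets` is a helper name invented for this example.

```sh
# count_targets is a hypothetical helper: it reads the JSON from /api/v1/targets on stdin
# and counts the targets whose health equals the given state ("up" or "down").
count_targets() {
  grep -o "\"health\":\"$1\"" | wc -l | tr -d ' '
}

if targets="$(curl -fs --max-time 3 http://localhost:9090/api/v1/targets)"; then
  echo "targets up:   $(printf '%s' "$targets" | count_targets up)"
  echo "targets down: $(printf '%s' "$targets" | count_targets down)"
else
  echo "Prometheus API is not reachable on localhost:9090"
fi
```

With the configuration above, four targets should eventually be reported as up, but only after NodeExporter, the node, and the relayer are all running.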

### Starting NodeExporter

Next, configure Systemd for NodeExporter. Create a unit file at the following path.

```sh
sudo vi /etc/systemd/system/node_exporter.service
```

The configuration file should look as below.

```systemd
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
```

Now, enable and start the service.

```sh
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
```
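The `HostHighCpuLoad` rule depends on the `node_cpu_seconds_total` metric, so it can be worth confirming that NodeExporter actually exports it. This is a small sketch assuming the default NodeExporter port `9100` used in the scrape configuration; `has_metric` is a hypothetical helper.

```sh
# has_metric is a hypothetical helper: it reads Prometheus text-format metrics on stdin
# and succeeds if at least one sample line of the named metric is present.
has_metric() {
  # Match "name{" or "name " at the start of a line, which skips "# HELP" and "# TYPE" comments
  grep -q "^$1[{ ]"
}

if curl -fs --max-time 3 http://localhost:9100/metrics | has_metric node_cpu_seconds_total; then
  echo "node_cpu_seconds_total is exported"
else
  echo "node_cpu_seconds_total not found; is node_exporter running on port 9100?"
fi
```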

### Starting AlertManager

Next, configure Systemd for AlertManager. Create a unit file at the following path.

```sh
sudo vi /etc/systemd/system/alertmanager.service
```

The configuration file should look as below.

```systemd
[Unit]
Description=AlertManager Server Service
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/alertmanager \
--config.file /etc/alertmanager/alertmanager.yml \
--storage.path /var/lib/alertmanager \
--web.external-url=http://localhost:9093 \
--cluster.advertise-address='0.0.0.0:9093'

[Install]
WantedBy=multi-user.target
```

Now, enable and start the service.

```sh
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
```

### Installing Grafana

To visualize your Prometheus metrics, install Grafana, which queries the Prometheus server. The latest releases are listed on the [download page](https://grafana.com/grafana/download). Run the following commands to install the required dependencies and the Grafana package.

```sh
sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/enterprise/release/grafana-enterprise_10.0.1_amd64.deb
sudo dpkg -i grafana-enterprise_10.0.1_amd64.deb
```

### Starting Grafana

Then enable and start the service with default configurations.

```sh
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
```

You can now access Grafana at `YOUR_SERVER_IP_ADDRESS:3000/login`. The default username and password are both `admin`.
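At this point four Systemd services should be running. A quick way to confirm this, sketched under the assumption that the unit names match the ones used in this guide (`report_service` is a hypothetical helper):

```sh
# report_service is a hypothetical helper: it prints whether a Systemd unit is active.
report_service() {
  if systemctl is-active --quiet "$1" 2>/dev/null; then
    echo "$1: active"
  else
    echo "$1: not active"
  fi
}

# Unit names as created in the steps above
for svc in prometheus node_exporter alertmanager grafana-server; do
  report_service "$svc"
done
```

For any unit reported as not active, `sudo journalctl -u <unit>` shows its recent logs.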

## Using Docker

This section describes how to install and set up Prometheus and Grafana using Docker.

### Requirements

First, make sure Docker and Docker Compose are installed on your server. The `docker-compose.yml` file is provided in the Bifrost node's [GitHub](https://github.com/bifrost-platform/bifrost-node/blob/main/maintenance/docker-compose.yml) repository, in the `maintenance` directory. Clone the repository and change into that directory with the commands below.

```sh
git clone https://github.com/bifrost-platform/bifrost-node.git
cd bifrost-node/maintenance
```

### Configure AlertManager

The alert rules are predefined in the `maintenance/prometheus/rules.yml` file. It contains two basic rules that trigger an alert when an instance is down or when CPU usage exceeds 80%. The file is provided as shown below.

```yaml
groups:
  - name: alert_rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance $labels.instance down"
          description: "[{{ $labels.instance }}] of job [{{ $labels.job }}] has been down for more than 1 minute."

      - alert: HostHighCpuLoad
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 80
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: Host high CPU load (instance bLd Kusama)
          description: "CPU load is > 80%\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
```

The `alertmanager.yml` file configures the external service that is called when an alert is triggered. Here, we use Gmail notifications.

{% hint style="info" %}
For Gmail notifications, you need to generate an app password. We recommend using a dedicated email address for your alerts. To set one up, follow this [link](https://support.google.com/mail/answer/185833?hl=en).
{% endhint %}

The file is located at `maintenance/alertmanager/alertmanager.yml`. Add the Gmail configuration to it as shown below and save the file.

```yaml
global:
  resolve_timeout: 1m

route:
  receiver: 'gmail-notifications'

receivers:
  - name: 'gmail-notifications'
    email_configs:
      - to: '' # receiver email
        from: '' # sender(monitoring system) gmail
        smarthost: 'smtp.gmail.com:587'
        auth_username: '' # sender(monitoring system) gmail
        auth_identity: '' # sender(monitoring system) gmail
        auth_password: '' # sender(monitoring system) gmail's app password <https://support.google.com/mail/answer/185833?hl=en>
        send_resolved: true
```

<details>

<summary>Example</summary>

```yaml
global:
  resolve_timeout: 1m

route:
  receiver: 'gmail-notifications'

receivers:
  - name: 'gmail-notifications'
    email_configs:
      - to: 'receiver-example@gmail.com'
        from: 'sender-example@gmail.com'
        smarthost: 'smtp.gmail.com:587'
        auth_username: 'sender-example@gmail.com'
        auth_identity: 'sender-example@gmail.com'
        auth_password: 'my-auth-password'
        send_resolved: true
```

</details>

### Configure Prometheus

Prometheus needs a configuration file before it can start. The file is located at `maintenance/prometheus/prometheus.yml`. Full-node operators who run both the node and the relayer should manually uncomment the `bifrost_relayer` job below.

```yaml
global:
  scrape_interval: 3s
  evaluation_interval: 3s

rule_files:
  - "rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "alertmanager:9093"

scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 3s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "bifrost_node"
    scrape_interval: 3s
    static_configs:
      - targets: ["host.docker.internal:9615"]
  - job_name: "node_exporter"
    scrape_interval: 3s
    static_configs:
      - targets: ["node-exporter:9100"]
  # - job_name: "bifrost_relayer"
  #   scrape_interval: 3s
  #   static_configs:
  #     - targets: ["host.docker.internal:8000"]
```

### Run Docker Containers

Once you have completed all the steps above, return to the `maintenance` directory and run the following command.

```sh
docker compose up -d
```

You can now access Grafana at `YOUR_SERVER_IP_ADDRESS:3000/login`. The default username and password are both `admin`.
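If the login page does not load, listing the containers of the Compose project usually shows which one failed to start. The sketch below is meant to be run from the `maintenance` directory; `compose_status` is a hypothetical helper that only guards against a missing Docker installation.

```sh
# compose_status is a hypothetical helper: it lists the containers of the Compose project
# defined in the current directory.
compose_status() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found in PATH"
    return 127
  fi
  docker compose ps
}

compose_status || echo "could not query Docker Compose; run this from the maintenance directory"
```

To inspect a single container, `docker compose logs <service>` prints its output. The service names are defined in the repository's `docker-compose.yml`; names such as `prometheus` or `alertmanager` are assumptions based on the hostnames used in this guide.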

## Datasource Configuration

Once everything is running, create a new Prometheus datasource in Grafana, set the `URL` to `http://localhost:9090`, and click "Save & Test" as shown below.

{% hint style="info" %}
For Docker users, the URL should be set to `http://prometheus:9090`.
{% endhint %}

<figure><img src="https://3596680066-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FISx9P4kYm2Zecfailc61%2Fuploads%2F59F8dsFi5IBBxkpU9GCE%2F%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202023-07-17%20%E1%84%8B%E1%85%A9%E1%84%92%E1%85%AE%205.41.32.png?alt=media&#x26;token=2cc328a2-9bfd-427f-b4d4-1ca8a0aab844" alt=""><figcaption></figcaption></figure>

Then, create a new Prometheus AlertManager datasource, set the URL to `http://localhost:9093`, and click "Save & Test" as shown below.

{% hint style="info" %}
For Docker users, the URL should be set to `http://alertmanager:9093`.
{% endhint %}

<figure><img src="https://3596680066-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FISx9P4kYm2Zecfailc61%2Fuploads%2FQxZhoFCuIL12kX8jx3Rt%2F%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202023-07-17%20%E1%84%8B%E1%85%A9%E1%84%92%E1%85%AE%205.42.25.png?alt=media&#x26;token=f92b0d02-f7dc-44b8-8246-e3ff0c846e2d" alt=""><figcaption></figcaption></figure>

Next, import the dashboard. Open the "Dashboards" tab and click "New" to import the dashboard as shown below.

<figure><img src="https://3596680066-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FISx9P4kYm2Zecfailc61%2Fuploads%2FRnIInJUWMMLZj79fGnQ5%2F%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202023-07-17%20%E1%84%8B%E1%85%A9%E1%84%92%E1%85%AE%205.45.24.png?alt=media&#x26;token=4f440318-02ea-43ae-87b1-8543cacb0d9c" alt=""><figcaption></figcaption></figure>

Now, in the "Import via grafana.com" section, input the dashboard ID as `19207` and then click "Load" to continue.

<figure><img src="https://3596680066-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FISx9P4kYm2Zecfailc61%2Fuploads%2FAtLN5CtqGFOtE2ZasD4s%2F%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202023-07-17%20%E1%84%8B%E1%85%A9%E1%84%92%E1%85%AE%205.46.24.png?alt=media&#x26;token=e9e62275-abe2-460c-a4e5-542cd208367c" alt=""><figcaption></figcaption></figure>

Once the dashboard has loaded, select the datasources you just created for both Prometheus and the AlertManager. Then click "Import" to continue.

<figure><img src="https://3596680066-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FISx9P4kYm2Zecfailc61%2Fuploads%2FzHjv0caK87Qy09GRcTQ0%2F%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202023-07-17%20%E1%84%8B%E1%85%A9%E1%84%92%E1%85%AE%205.48.39.png?alt=media&#x26;token=10b3a909-578d-4ff6-ae16-cc84c42d051f" alt=""><figcaption></figcaption></figure>

If your node and relayer are running in the background, the collected metrics will be visualized as shown below.

<figure><img src="https://3596680066-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FISx9P4kYm2Zecfailc61%2Fuploads%2FMcwa6vVMrwMRtzq9xXq9%2F%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202023-07-17%20%E1%84%8B%E1%85%A9%E1%84%92%E1%85%AE%205.50.54.png?alt=media&#x26;token=efba97dd-e4ce-4255-898b-8638563fcfb3" alt=""><figcaption></figcaption></figure>
