Add Logs To Prometheus

A quick summary of Add Logs To Prometheus

Count nginx access and error log records collected by Fluentd (td-agent) and expose the counts as Prometheus metrics.

Preparation

Install the Prometheus plugin on the log server so that Fluentd can expose log counts as Prometheus metrics: sudo td-agent-gem install fluent-plugin-prometheus

After installing the plugin, configure the td-agent (Fluentd) config files to receive and forward logs.

This guide forwards the nginx error and access logs.

Client server Fluentd setup

The client server only collects the required service logs, tags them, and forwards them. All visualization is set up on the log server.

Structure of the td-agent directory

/etc/td-agent/
├── nginx.access.conf
├── nginx.error.conf
├── plugin
└── td-agent.conf

The nginx.access.conf and nginx.error.conf files collect and parse the logs from the nginx service.
The td-agent.conf file holds only the top-level configuration for sending or receiving logs.

Structure of the nginx configuration files on the client

The default nginx.access.conf file should look like this:

# Tail the nginx access log and tag each record
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/log/nginx.access.pos
  format /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" "(?<forwarder>[^\"]*)")?/
  time_format %d/%b/%Y:%H:%M:%S %z
  # Tag name; used to match these logs on the log server
  tag nginx.access
</source>

# Match nginx access logs and send them to the log server
<match nginx.access>
  @type copy
  <store>
    @type forward
    <server>
      name myserver1
      host 10.10.2.11
      port 24224
      weight 60
    </server>
  </store>
</match>

The default nginx.error.conf file looks almost the same:

# Tail the nginx error log and tag each record
<source>
  @type tail
  path /var/log/nginx/error.log
  pos_file /var/log/td-agent/log/nginx.error.pos
  format /(?<time>[^ ]* [^ ]*) +(?<method>[^ ]*) +(?<path>[^ ]*) +(?<message>[^ ].*$)/
  # Tag name; used to match these logs on the log server
  tag nginx.error
</source>


# Match nginx error logs and send them to the log server
<match nginx.error>
  @type copy
  <store>
    @type forward
    <server>
      name myserver1
      host 10.10.2.11
      port 24224
      weight 60
    </server>
  </store>
</match>

Structure of the td-agent client file

Default td-agent.conf file

# Syslog input (for messages forwarded by rsyslog)
<source>
  @type syslog
  port 5140
  tag 10.10.2.12
</source>


# Add nginx config files to td-agent file.
@include nginx.error.conf
@include nginx.access.conf
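
Once the log server described in the next section is listening, it can help to confirm from the client that the forward port is reachable. The following is a minimal sketch, not part of the original setup: `port_open` is a hypothetical helper name, and it assumes `bash` and the coreutils `timeout` command are available on the client. The host and port are the ones used in the `<server>` blocks above.

```shell
# Hypothetical helper: succeeds only if a TCP connection to host $1, port $2
# can be opened within 3 seconds (uses bash's /dev/tcp redirection).
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Check the forward port of the log server from the client.
if port_open 10.10.2.11 24224; then
  echo "log server forward port is reachable"
else
  echo "log server forward port is NOT reachable"
fi
```

A failure here usually points at a firewall rule or at td-agent not yet running on the log server, rather than at the client configuration.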

Log server Fluentd setup

Structure of the td-agent directory

/etc/td-agent/
├── nginx.access.conf
├── nginx.error.conf
├── plugin
└── td-agent.conf

As on the client, td-agent.conf holds only the top-level configuration: it receives the forwarded logs and exposes the metrics in Prometheus format.
Filtering and matching of the received logs are defined in the nginx.access.conf and nginx.error.conf files.

Structure of the nginx configuration files on the log server

Default nginx.access.conf file

# Count nginx access log records by tag with the prometheus filter plugin

<filter nginx.access>
  @type prometheus
  <metric>
    name nginx_access_records_total
    type counter
    desc The total number of incoming access
    <labels>
      tag ${tag}
      hostname ${hostname}
    </labels>
  </metric>
</filter>

# Match the received access logs and store them in gzip-compressed files
<match nginx.access>
  @type copy
  <store>
   @type file
   path /var/log/td-agent/nginx.access
   compress gzip
   <buffer>
    timekey 1d
    timekey_use_utc true
    timekey_wait 10m
   </buffer>
  </store>
</match>

The default nginx.error.conf file is almost the same:

# Count nginx error log records by tag with the prometheus filter plugin

<filter nginx.error>
  @type prometheus
  <metric>
    name nginx_error_records_total
    type counter
    desc The total number of incoming errors
    <labels>
      tag ${tag}
      hostname ${hostname}
    </labels>
  </metric>
</filter>

# Match the received error logs and store them in gzip-compressed files
<match nginx.error>
  @type copy
  <store>
   @type file
   path /var/log/td-agent/nginx.error
   compress gzip
   <buffer>
    timekey 1d
    timekey_use_utc true
    timekey_wait 10m
   </buffer>
  </store>
</match>

Structure of the td-agent log server file

Default td-agent.conf file

# Address and port on which to receive forwarded logs

<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>

# Expose metrics in Prometheus format
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>

# Add nginx config files to td-agent file.

@include nginx.access.conf
@include nginx.error.conf
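
The prometheus source above only exposes the metrics; Prometheus itself still has to scrape that endpoint. A minimal sketch of the matching fragment in prometheus.yml could look like the following. The job name `fluentd` is an arbitrary choice, and the target assumes the log server is reachable from Prometheus at 10.10.2.11, the address used in the client forward configuration.

```yaml
# prometheus.yml (fragment). Job name and target address are assumptions.
scrape_configs:
  - job_name: fluentd
    metrics_path: /metrics          # matches metrics_path in the prometheus source
    static_configs:
      - targets: ['10.10.2.11:24231']   # log server, port of the prometheus source
```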

Check the td-agent config file

After adding all the config files, verify that they are correct.
Run this command on both servers: sudo td-agent -c /etc/td-agent/td-agent.conf

The command starts Fluentd in the foreground and prints the parsed configuration. On the log server, the output should look like this:

2019-07-03 07:29:57 +0000 [info]: parsing config file is succeeded path="/etc/td-agent/td-agent.conf"
2019-07-03 07:29:57 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type forward
    bind "0.0.0.0"
    port 24224
  </source>
  <source>
    @type prometheus
    bind "0.0.0.0"
    port 24231
    metrics_path "/metrics"
  </source>
  <filter nginx.access>
    @type prometheus
    <metric>
      name nginx_access_records_total
      type counter
      desc The total number of incoming access
      <labels>
        tag ${tag}
        hostname ${hostname}
      </labels>
    </metric>
  </filter>
  <match nginx.access>
    @type copy
    <store>
      @type "file"
      path "/var/log/td-agent/nginx.access"
      compress gzip
      <buffer>
        timekey 1d
        timekey_use_utc true
        timekey_wait 10m
        path "/var/log/td-agent/nginx.access"
      </buffer>
    </store>
  </match>
  <filter nginx.error>
    @type prometheus
    <metric>
      name nginx_error_records_total
      type counter
      desc The total number of incoming errors
      <labels>
        tag ${tag}
        hostname ${hostname}
      </labels>
    </metric>
  </filter>
  <match nginx.error>
    @type copy
    <store>
      @type "file"
      path "/var/log/td-agent/nginx.error"
      compress gzip
      <buffer>
        timekey 1d
        timekey_use_utc true
        timekey_wait 10m
        path "/var/log/td-agent/nginx.error"
      </buffer>
    </store>
  </match>
</ROOT>
2019-07-03 07:29:57 +0000 [info]: starting fluentd-1.4.2 pid=10594 ruby="2.4.6"
2019-07-03 07:29:57 +0000 [info]: spawn command to main:  cmdline=["/opt/td-agent/embedded/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/sbin/td-agent", "-c", "/etc/td-agent/td-agent.conf", "--under-supervisor"]
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '3.5.1'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-kafka' version '0.9.4'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.4.0'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.0.1'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-record-modifier' version '2.0.1'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.2.0'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-s3' version '1.1.10'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-secure-forward' version '0.4.5'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-td' version '1.0.0'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-td-monitoring' version '0.2.4'
2019-07-03 07:29:58 +0000 [info]: gem 'fluent-plugin-webhdfs' version '1.2.3'
2019-07-03 07:29:58 +0000 [info]: gem 'fluentd' version '1.4.2'
2019-07-03 07:29:58 +0000 [info]: adding filter pattern="nginx.access" type="prometheus"
2019-07-03 07:29:58 +0000 [info]: adding match pattern="nginx.access" type="copy"
2019-07-03 07:29:58 +0000 [info]: adding filter pattern="nginx.error" type="prometheus"
2019-07-03 07:29:58 +0000 [info]: adding match pattern="nginx.error" type="copy"
2019-07-03 07:29:58 +0000 [info]: adding source type="forward"
2019-07-03 07:29:58 +0000 [info]: adding source type="prometheus"
2019-07-03 07:29:58 +0000 [info]: #0 starting fluentd worker pid=10601 ppid=10594 worker=0
2019-07-03 07:29:58 +0000 [info]: #0 listening port port=24224 bind="0.0.0.0"
2019-07-03 07:29:58 +0000 [info]: #0 fluentd worker is now running worker=0
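
To validate the configuration without starting the workers, Fluentd's `--dry-run` option can be used instead. The sketch below wraps it in a small function; `check_config` is a hypothetical helper name, and the path is the default one used in this guide.

```shell
# Hypothetical helper: parse the given config with --dry-run (no workers are started).
# Returns non-zero if td-agent is missing or the config is invalid.
check_config() {
  if ! command -v td-agent >/dev/null 2>&1; then
    echo "td-agent not found in PATH" >&2
    return 127
  fi
  sudo td-agent --dry-run -c "$1"
}

if check_config /etc/td-agent/td-agent.conf; then
  echo "configuration OK"
else
  echo "configuration check FAILED"
fi
```

Unlike the foreground run shown above, this does not bind ports 24224 or 24231, so it should be safe to run while the td-agent service is already active.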

If the configuration is valid and no errors are reported, the new metrics should appear at localhost:24231/metrics after some time:

# TYPE nginx_access_records_total counter
# HELP nginx_access_records_total The total number of incoming access
nginx_access_records_total{tag="nginx.access",hostname="ubuntu"} 236.0
# TYPE nginx_error_records_total counter
# HELP nginx_error_records_total The total number of incoming errors
nginx_error_records_total{tag="nginx.error",hostname="ubuntu"} 42.0
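
The endpoint also exposes other metrics, so it can be convenient to filter for the two counters defined in this guide. This is a small sketch: `nginx_counters` is a hypothetical helper name, and the URL assumes the command is run on the log server with the port set in the prometheus source (24231).

```shell
# Hypothetical helper: keep only the sample lines of the two nginx counters,
# dropping the "# HELP" / "# TYPE" comment lines and any other metrics.
nginx_counters() {
  grep -E '^nginx_(access|error)_records_total[{ ]'
}

# Usage on the log server:
#   curl -s http://localhost:24231/metrics | nginx_counters
```

If the command prints nothing, no matching records have reached the filters yet; the counters only appear after the first access or error log line has been received.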

After that, these metrics can be added to alert rules in Prometheus.
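
As an illustration, an alert rule on the error counter might look like the sketch below. The file name, group name, alert name, threshold, and durations are all placeholders to adapt, not values from the original setup. The `hostname` label appears to carry the name of the host running the filter (the log server), so it may not identify the client that produced the log line.

```yaml
# nginx_alerts.yml (hypothetical rule file, referenced from rule_files in prometheus.yml)
groups:
  - name: nginx-logs
    rules:
      - alert: NginxErrorLogRateHigh
        # More than 1 error log line per second, averaged over 5 minutes (placeholder threshold)
        expr: rate(nginx_error_records_total[5m]) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High nginx error log rate reported by {{ $labels.hostname }}"
```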