Setting Up a Grafana Environment with Docker

Feb 09, 2022

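Grafana + Telegraf + InfluxDB is a very common combination for monitoring system status,

but installing and configuring it on every machine is a real hassle, so I wanted to package it with Docker.

After that, monitoring a new machine only takes a single docker-compose up.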

Thanks to the almighty GitHub, there are already plenty of examples to look at. This setup was copied and modified from:

GitHub - nicolargo/docker-influxdb-grafana: Docker-compose files for a simple InfluxDB + Grafana stack
https://github.com/nicolargo/docker-influxdb-grafana

This setup does not build any additional Docker images; everything uses official images.

So there should be fewer security concerns.

docker-compose.yml

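Compared with the original, this docker-compose file mainly mounts the host's /proc at /host/proc so that the container can monitor the whole host,

and adds an nginx reverse proxy that restricts which IPs may connect.

Of course, if you need HTTPS, you can swap in another HTTP server such as Caddy.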

influxdb:
  image: influxdb:latest
  container_name: influxdb
  ports:
    - "127.0.0.1:8083:8083"
    - "127.0.0.1:8086:8086"
    - "127.0.0.1:8090:8090"
  env_file:
    - 'env.influxdb'
  volumes:
    # Data persistency
    - ./data/influxdb:/var/lib/influxdb

telegraf:
  image: telegraf:latest
  container_name: telegraf
  links:
    - influxdb
  environment:
    - HOST_PROC=/host/proc
  volumes:
    - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
    - /proc:/host/proc:ro

grafana:
  image: grafana/grafana:latest
  container_name: grafana
  ports:
    - "3000:3000"
  env_file:
    - 'env.grafana'
  user: "0"
  links:
    - influxdb
  volumes:
    # Data persistency
    - ./data/grafana:/var/lib/grafana

nginx:
  image: nginx:latest
  ports:
    - "8080:8080"
  volumes:
    - ./nginx.conf:/etc/nginx/nginx.conf:ro
  links:
    - grafana
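To illustrate why the /proc mount and the HOST_PROC variable go together, here is a minimal Python sketch of the idea: a process inside the container reads metrics from whatever proc root HOST_PROC names, so it sees the host's numbers instead of its own. This is not Telegraf's code; the function names read_loadavg and proc_root_from_env are hypothetical and exist only for this illustration.

```python
import os
from pathlib import Path
from typing import Tuple


def proc_root_from_env() -> str:
    """Return the proc root to read from (hypothetical helper).

    Falls back to the container's own /proc when HOST_PROC is not set.
    """
    return os.environ.get("HOST_PROC", "/proc")


def read_loadavg(proc_root: str) -> Tuple[float, float, float]:
    """Return the 1, 5 and 15 minute load averages found under proc_root."""
    fields = Path(proc_root, "loadavg").read_text().split()
    return float(fields[0]), float(fields[1]), float(fields[2])
```

Inside the telegraf container from the compose file above, proc_root_from_env() would return /host/proc, and read_loadavg would then parse the host's loadavg file through the read-only mount.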

telegraf.conf

Copied from the Grafana dashboard page https://grafana.com/grafana/dashboards/928

# Copy from https://grafana.com/grafana/dashboards/928

# Global tags can be specified here in key="value" format.
[global_tags]
  # dc = "us-east-1" # will tag all metrics with dc=us-east-1
  # rack = "1a"
  ## Environment variables can be used as tags, and throughout the config file
  # user = "$USER"

# Configuration for telegraf agent
[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  debug = false
  quiet = false
  hostname = ""
  omit_hostname = false

### OUTPUT

# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]
  database = "telegraf_metrics"

  ## Retention policy to write to. Empty string writes to the default rp.
  retention_policy = ""
  ## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
  write_consistency = "any"

  ## Write timeout (for the InfluxDB client), formatted as a string.
  ## If not provided, will default to 5s. 0s means no timeout (not recommended).
  timeout = "5s"
  # username = "telegraf"
  # password = "2bmpiIeSWd63a7ew"
  ## Set the user agent for HTTP POSTs (can be useful for log differentiation)
  # user_agent = "telegraf"
  ## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
  # udp_payload = 512

# Read metrics about cpu usage
[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = true
  ## Comment this line if you want the raw CPU time metrics
  fielddrop = ["time_*"]

# Read metrics about disk usage by mount point
[[inputs.disk]]
  ## By default, telegraf gather stats for all mountpoints.
  ## Setting mountpoints will restrict the stats to the specified mountpoints.
  # mount_points = ["/"]

  ## Ignore some mountpoints by filesystem type. For example (dev)tmpfs (usually
  ## present on /run, /var/run, /dev/shm or /dev).
  ignore_fs = ["tmpfs", "devtmpfs"]

# Read metrics about disk IO by device
[[inputs.diskio]]
  ## By default, telegraf will gather stats for all devices including
  ## disk partitions.
  ## Setting devices will restrict the stats to the specified devices.
  # devices = ["sda", "sdb"]
  ## Uncomment the following line if you need disk serial numbers.
  # skip_serial_number = false

# Get kernel statistics from /proc/stat
[[inputs.kernel]]
  # no configuration

# Read metrics about memory usage
[[inputs.mem]]
  # no configuration

# Get the number of processes and group them by status
[[inputs.processes]]
  # no configuration

# Read metrics about swap memory usage
[[inputs.swap]]
  # no configuration

# Read metrics about system load & uptime
[[inputs.system]]
  # no configuration

# Read metrics about network interface usage
[[inputs.net]]
  # collect data only about specific interfaces
  # interfaces = ["eth0"]

[[inputs.netstat]]
  # no configuration

[[inputs.interrupts]]
  # no configuration

[[inputs.linux_sysctl_fs]]
  # no configuration
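Once the stack is running, one way to check that Telegraf is actually writing into the telegraf_metrics database is to ask InfluxDB which measurements exist. The sketch below only builds the request URL. It assumes the InfluxDB container exposes the 1.x-style /query HTTP endpoint on 127.0.0.1:8086 (the port published in docker-compose.yml); whether that holds depends on which version influxdb:latest resolves to. The helper name influx_query_url is hypothetical.

```python
from urllib.parse import urlencode


def influx_query_url(base_url: str, database: str, query: str) -> str:
    """Build a GET URL for an InfluxDB 1.x-style /query endpoint (assumed API)."""
    params = urlencode({"db": database, "q": query})
    return f"{base_url.rstrip('/')}/query?{params}"


# Database name taken from telegraf.conf, host and port from docker-compose.yml.
url = influx_query_url("http://127.0.0.1:8086", "telegraf_metrics", "SHOW MEASUREMENTS")
print(url)
```

The printed URL can be opened with curl or a browser on the host. If the cpu, mem, disk and similar measurements are listed, the Telegraf output section is working.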

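Using nginx as a reverse proxy

Configuring Docker together with iptables is rather troublesome,

so nginx is used as a reverse proxy, with an IP restriction added along the way.

The contents of nginx.conf are as follows: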

worker_processes 1;
pid /tmp/nginx.pid;

events {
    worker_connections 1024;
}

http {
    # this is required to proxy Grafana Live WebSocket connections.
    map $http_upgrade $connection_upgrade {
        default upgrade;
        '' close;
    }

    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;

    server {
        listen 8080;
        server_name localhost;

        location / {
            proxy_pass http://grafana:3000/;
        }

        # Proxy Grafana Live WebSocket connections.
        location /api/live {
            rewrite ^/(.*) /$1 break;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
            proxy_set_header Host $http_host;
            proxy_pass http://grafana:3000/;
        }
    }
}

Adding an IP access restriction

location / {
    allow 127.0.0.1;
    deny all;
    proxy_pass http://grafana:3000/;
}

With the IP restriction in place, the port opened by the container can no longer be accessed arbitrarily.
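The allow and deny directives are evaluated in order, and the first rule that matches the client address decides the outcome. The following Python sketch mimics that first-match behaviour for the two rules in the location block. It is only an illustration of the ordering, not nginx's implementation; RULES and is_allowed are hypothetical names, and "deny all" is approximated with the IPv4 network 0.0.0.0/0.

```python
from ipaddress import ip_address, ip_network

# Rules in the order they appear in the location block: (action, network).
RULES = [
    ("allow", ip_network("127.0.0.1/32")),
    ("deny", ip_network("0.0.0.0/0")),  # stands in for "deny all" (IPv4 only)
]


def is_allowed(client_ip: str, rules=RULES) -> bool:
    """Return True if the first rule matching client_ip is an allow rule."""
    addr = ip_address(client_ip)
    for action, network in rules:
        if addr in network:
            return action == "allow"
    # When no rule matches, nginx lets the request through.
    return True
```

To open access to a wider range, such as an office subnet, you would add another allow line before deny all; in the sketch this corresponds to adding another ("allow", ...) entry ahead of the deny rule.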

All of these changes have been committed to:

GitHub - Swind/docker-influxdb-grafana: Docker-compose files for a simple InfluxDB + Grafana stack
https://github.com/Swind/docker-influxdb-grafana

Reference

Run Grafana behind a reverse proxy | Grafana Labs
https://grafana.com/tutorials/run-grafana-behind-a-proxy/