docker.recipes

Vector Log Pipeline

advanced

High-performance observability pipeline with Vector, ClickHouse storage, and Grafana.

Overview

Vector is a high-performance observability data pipeline, originally built by Timber.io and now maintained by Datadog, that collects, transforms, and routes logs, metrics, and traces at scale. It was created to address the limitations of older log processing tools such as Logstash and Fluentd, and it offers better performance, memory safety through Rust, and more flexible data transformation.

This recipe builds a complete log analytics stack: Vector collects and processes log data from multiple sources, ClickHouse stores it in a columnar database for fast analytical queries, and Grafana visualizes the results in dashboards. ClickHouse's column-oriented architecture and aggressive compression can return sub-second results on billions of log entries when the schema and queries are designed well.

The stack is a good fit for organizations that process high-volume log data and need both real-time ingestion and fast historical analysis without paying for a proprietary logging platform. Because Vector can parse, filter, and transform logs in flight and ClickHouse is queried with SQL, the combination is a strong alternative to ELK stack deployments, especially when SQL-based log analysis is preferred over Elasticsearch's query DSL.

Key Features

  • Vector's high-performance log ingestion with support for 50+ data sources including syslog, Docker, Kubernetes, and file tailing
  • Real-time log transformation and filtering using Vector Remap Language (VRL) for field extraction and data enrichment
  • ClickHouse columnar storage, which can run analytical queries 10-100x faster than traditional row-based log storage, depending on schema and query shape
  • Grafana ClickHouse data source integration for SQL-based dashboard creation and log exploration
  • Vector's backpressure handling and automatic retries ensuring reliable log delivery to ClickHouse
  • ClickHouse MergeTree engine suited to time-series log data, with date-based partitioning and TTL policies defined in the table schema
  • NGINX reverse proxy providing unified access point and load balancing for the entire observability stack
  • Vector's built-in metrics and health endpoints for monitoring the pipeline performance itself
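
The MergeTree partitioning and TTL feature above can be sketched as a table definition. This is a minimal sketch, not the recipe's own schema: the file name `logs-schema.sql`, the table name `app_logs`, the column set, and the 30-day retention are all assumptions chosen for illustration. The database name `logs` comes from the .env template.

```shell
# Write a hypothetical schema file; table name, columns, and retention are illustrative.
cat > logs-schema.sql << 'EOF'
CREATE TABLE IF NOT EXISTS logs.app_logs
(
    timestamp      DateTime64(3),
    host           LowCardinality(String),
    container_name LowCardinality(String),
    stream         LowCardinality(String),
    message        String CODEC(ZSTD(3))
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)               -- one partition per day
ORDER BY (container_name, timestamp)         -- sort key used by most log queries
TTL toDateTime(timestamp) + INTERVAL 30 DAY  -- rows older than 30 days are dropped
EOF

# Apply it once the clickhouse service is running (credentials from .env):
# docker compose exec -T clickhouse clickhouse-client \
#   --user vector --password secure_clickhouse_password < logs-schema.sql
```

Daily partitions keep TTL cleanup cheap because whole partitions tend to expire together; a coarser key such as `toYYYYMM(timestamp)` may suit lower volumes better.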

Common Use Cases

  • Application log analytics for debugging and performance monitoring in microservices architectures
  • Infrastructure monitoring for server logs, system metrics, and container orchestration platforms
  • Security information and event management (SIEM) with fast log correlation and threat detection queries
  • Business intelligence dashboards combining application logs with operational metrics for product insights
  • Compliance logging with long-term retention and fast audit trail queries using ClickHouse's compression
  • DevOps teams replacing expensive Splunk or Datadog deployments with open-source log analytics
  • Real-time alerting based on log patterns and anomaly detection using Grafana's alerting rules

Prerequisites

  • Minimum 8GB RAM recommended for ClickHouse analytical workloads and log retention
  • Docker host with sufficient disk I/O performance for high-volume log ingestion and ClickHouse storage
  • Network access to the log sources that Vector will collect from (port 514 for syslog, read access to the Docker socket for container logs)
  • Basic understanding of SQL for creating ClickHouse schemas and Grafana dashboard queries
  • Familiarity with Vector Remap Language (VRL) for log parsing and transformation configuration
  • Available ports: 80 (NGINX), 3000 (Grafana), 8123 (ClickHouse HTTP), 8686 (Vector API), 9000 (Vector sources), 9001 (ClickHouse native protocol, mapped from container port 9000)

For development and testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9000:9000"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9001:9000"
    volumes:
      - clickhouse_data:/var/lib/clickhouse
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/config.xml:ro
    environment:
      CLICKHOUSE_DB: ${CLICKHOUSE_DB}
      CLICKHOUSE_USER: ${CLICKHOUSE_USER}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    networks:
      - vector-net
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_INSTALL_PLUGINS: grafana-clickhouse-datasource
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - grafana
    networks:
      - vector-net
    restart: unless-stopped

volumes:
  clickhouse_data:
  grafana_data:

networks:
  vector-net:
    driver: bridge

.env Template

.env
# ClickHouse
CLICKHOUSE_DB=logs
CLICKHOUSE_USER=vector
CLICKHOUSE_PASSWORD=secure_clickhouse_password

# Grafana
GRAFANA_PASSWORD=secure_grafana_password

Usage Notes

  1. Vector API at http://localhost:8686
  2. ClickHouse HTTP at http://localhost:8123
  3. Grafana dashboards at http://localhost:3000
  4. Configure vector.toml for sources and sinks
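
The recipe mounts `./vector.toml` but does not show its contents, so the following is a minimal sketch of what it might hold, not the recipe's own configuration. The component names (`docker`, `host_logs`, `tcp_in`, `normalize`), the table name `app_logs`, and the batch values are assumptions; the endpoint, database, and credentials mirror the compose file and .env template. Depending on the image version, Vector may look for `/etc/vector/vector.yaml` by default, in which case passing `--config /etc/vector/vector.toml` as the container command is one way to point it at this file.

```shell
# Write a hypothetical vector.toml next to docker-compose.yml.
cat > vector.toml << 'EOF'
# Expose the API on all interfaces so the published port 8686 is reachable.
[api]
enabled = true
address = "0.0.0.0:8686"

# Container logs via the mounted Docker socket.
[sources.docker]
type = "docker_logs"

# Host log files from the read-only /var/log mount.
[sources.host_logs]
type = "file"
include = ["/var/log/*.log"]

# Raw TCP input on the published port 9000 (assumed use of that port).
[sources.tcp_in]
type = "socket"
mode = "tcp"
address = "0.0.0.0:9000"

[transforms.normalize]
type = "remap"
inputs = ["docker", "host_logs", "tcp_in"]
source = '''
# Events from the file and socket sources carry no container name.
if !exists(.container_name) {
  .container_name = "host"
}
'''

[sinks.clickhouse]
type = "clickhouse"
inputs = ["normalize"]
endpoint = "http://clickhouse:8123"
database = "logs"
table = "app_logs"            # assumed table; it must already exist
skip_unknown_fields = true    # ignore event fields the table has no column for
date_time_best_effort = true  # let ClickHouse parse RFC 3339 timestamps

[sinks.clickhouse.auth]
strategy = "basic"
user = "vector"
password = "secure_clickhouse_password"  # keep in sync with .env

[sinks.clickhouse.batch]
max_events = 10000  # illustrative values; tune against insert latency
timeout_secs = 5
EOF
```

Larger batches generally suit ClickHouse better than many small inserts, which is why the sink batches by both event count and time here.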

Individual Services (4 services)

Copy individual services to mix and match with your existing compose files.

vector
vector:
  image: timberio/vector:latest-alpine
  ports:
    - "8686:8686"
    - "9000:9000"
  volumes:
    - ./vector.toml:/etc/vector/vector.toml:ro
    - /var/log:/var/log:ro
    - /var/run/docker.sock:/var/run/docker.sock:ro
  depends_on:
    - clickhouse
  networks:
    - vector-net
  restart: unless-stopped
clickhouse
clickhouse:
  image: clickhouse/clickhouse-server:latest
  ports:
    - "8123:8123"
    - "9001:9000"
  volumes:
    - clickhouse_data:/var/lib/clickhouse
    - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/config.xml:ro
  environment:
    CLICKHOUSE_DB: ${CLICKHOUSE_DB}
    CLICKHOUSE_USER: ${CLICKHOUSE_USER}
    CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
  ulimits:
    nofile:
      soft: 262144
      hard: 262144
  networks:
    - vector-net
  restart: unless-stopped
grafana
grafana:
  image: grafana/grafana:latest
  ports:
    - "3000:3000"
  environment:
    GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
    GF_INSTALL_PLUGINS: grafana-clickhouse-datasource
  volumes:
    - grafana_data:/var/lib/grafana
  depends_on:
    - clickhouse
  networks:
    - vector-net
  restart: unless-stopped
nginx
nginx:
  image: nginx:alpine
  ports:
    - "80:80"
  volumes:
    - ./nginx.conf:/etc/nginx/nginx.conf:ro
  depends_on:
    - grafana
  networks:
    - vector-net
  restart: unless-stopped
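
The nginx service mounts `./nginx.conf` as the main NGINX configuration, but the recipe does not include that file. A minimal sketch, assuming NGINX should simply forward everything on port 80 to Grafana over the shared network; the `$connection_upgrade` map and header choices are illustrative and may need adjusting for your setup.

```shell
# Write a hypothetical nginx.conf that proxies port 80 to the grafana service.
cat > nginx.conf << 'EOF'
events {}

http {
    # Forward WebSocket upgrades only when the client asks for them.
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://grafana:3000;
            proxy_http_version 1.1;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
        }
    }
}
EOF
```

Because the file replaces `/etc/nginx/nginx.conf` rather than a `conf.d` snippet, it has to declare the `events` and `http` blocks itself.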

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9000:9000"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9001:9000"
    volumes:
      - clickhouse_data:/var/lib/clickhouse
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/config.xml:ro
    environment:
      CLICKHOUSE_DB: ${CLICKHOUSE_DB}
      CLICKHOUSE_USER: ${CLICKHOUSE_USER}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    networks:
      - vector-net
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_INSTALL_PLUGINS: grafana-clickhouse-datasource
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - grafana
    networks:
      - vector-net
    restart: unless-stopped

volumes:
  clickhouse_data:
  grafana_data:

networks:
  vector-net:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# ClickHouse
CLICKHOUSE_DB=logs
CLICKHOUSE_USER=vector
CLICKHOUSE_PASSWORD=secure_clickhouse_password

# Grafana
GRAFANA_PASSWORD=secure_grafana_password
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
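
The compose file bind-mounts `vector.toml`, `clickhouse-config.xml`, and `nginx.conf` from the current directory, and the steps above do not create them. When a bind-mount source is missing, Docker typically creates a directory in its place, which makes the affected container fail to start. The helpers below are a rough sketch: `preflight` and `check_stack` are hypothetical names, and the URLs assume the default published ports from this recipe.

```shell
# Hypothetical helper: succeeds only if every bind-mounted config file
# exists as a regular file in the current directory.
preflight() {
  for f in vector.toml clickhouse-config.xml nginx.conf; do
    if [ ! -f "$f" ]; then
      echo "missing config file: $f" >&2
      return 1
    fi
  done
}

# Hypothetical helper: probe the health endpoints of the running stack.
# Stops at the first endpoint that does not answer successfully.
check_stack() {
  curl -fsS http://localhost:8686/health &&   # Vector API
  curl -fsS http://localhost:8123/ping &&     # ClickHouse HTTP interface
  curl -fsS http://localhost:3000/api/health  # Grafana
}

# Typical order:
#   preflight && docker compose up -d
#   check_stack
```

Running `preflight` before step 3 catches a missing file early, and `check_stack` gives a quick signal that all three services are reachable before you start building dashboards.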

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/vector-log-pipeline/run | bash

Troubleshooting

  • ClickHouse 'Memory limit exceeded' errors: Increase max_memory_usage setting or add more RAM to handle large analytical queries
  • Vector pipeline lag or dropped logs: Check backpressure configuration and ClickHouse insert performance, tune batch sizes in vector.toml
  • Grafana ClickHouse datasource connection failed: Verify ClickHouse HTTP interface is accessible on port 8123 and credentials are correct
  • Vector sources not receiving logs: Check file permissions for /var/log access and Docker socket permissions for container logs
  • ClickHouse out of disk space: Configure TTL policies so old log data is deleted automatically, and consider stronger column codecs (for example ZSTD) than the MergeTree default compression
  • High Vector CPU usage during parsing: Optimize VRL transform functions and consider using native ClickHouse parsing functions instead
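
For the disk space item above, the ClickHouse HTTP interface on port 8123 is enough to inspect table sizes and tighten retention. This is a sketch under assumptions: `ch_query` is a hypothetical helper, `logs.app_logs` with a `timestamp` column is an assumed table, and the credentials default to the .env template values.

```shell
# Hypothetical helper: run one SQL statement through the ClickHouse HTTP interface.
ch_query() {
  curl -fsS "http://localhost:8123/" \
    --user "${CLICKHOUSE_USER:-vector}:${CLICKHOUSE_PASSWORD:-secure_clickhouse_password}" \
    --data-binary "$1"
}

# Compressed vs. uncompressed size of active parts, largest tables first.
DISK_USAGE_SQL="SELECT table,
  formatReadableSize(sum(data_compressed_bytes)) AS compressed,
  formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed
FROM system.parts
WHERE active
GROUP BY table
ORDER BY sum(data_compressed_bytes) DESC"

# Shorten retention on the assumed table to 14 days.
TTL_SQL="ALTER TABLE logs.app_logs MODIFY TTL toDateTime(timestamp) + INTERVAL 14 DAY"

# Run them once the stack is up:
# ch_query "$DISK_USAGE_SQL"
# ch_query "$TTL_SQL"
```

Comparing the compressed and uncompressed columns shows whether a codec change is likely to help before you shorten the retention period.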
