docker.recipes

Vector Log Aggregator Pipeline

Difficulty: intermediate

High-performance observability pipeline for logs, metrics, and traces.

Overview

Vector is a high-performance observability data pipeline, written in Rust and maintained by Datadog, that collects, transforms, and routes logs, metrics, and traces. It was developed to address the performance limitations and memory overhead of traditional log aggregators such as Fluentd and Logstash, and is designed to sustain high event throughput with modest resource consumption.

This stack combines Vector with ClickHouse's columnar storage engine for analytics, Prometheus for metrics collection, and Grafana for visualization. Vector acts as the central data router: it ingests logs from multiple sources, transforms and enriches them, and writes them to ClickHouse for long-term storage and analysis, while exposing its own operational metrics to Prometheus.

The architecture suits organizations that process large volumes of observability data and want control over their pipeline, costs, and performance. Because every component is self-hosted, it avoids the recurring costs of SaaS solutions, which makes it a good fit for companies with significant log volumes, compliance requirements, or plans to build custom analytics on their observability data.

Key Features

  • Vector's disk buffers, when enabled on a sink, persist queued events so they survive restarts and failures
  • ClickHouse's columnar storage provides sub-second query performance on billions of log entries with automatic data compression
  • Vector's VRL (Vector Remap Language) enables complex log parsing, enrichment, and transformation without external dependencies
  • Prometheus integration captures Vector's internal metrics including throughput, error rates, and buffer utilization
  • ClickHouse materialized views automatically aggregate log data for faster dashboard queries and reporting
  • Vector's adaptive concurrency automatically optimizes sink throughput based on downstream system performance
  • Grafana's ClickHouse data source plugin enables SQL-based log exploration with time-series visualization
  • Vector's topology health checks monitor sink connectivity and automatically buffer data during outages
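
The materialized-view feature above can be sketched as follows. This is a minimal illustration, not the recipe's actual schema: the file name clickhouse-logs-schema.sql, the table default.logs, the view default.logs_per_minute, and every column name are assumptions chosen to resemble the fields Vector's syslog source emits. The block only writes the SQL to a file; it does not contact a server.

```shell
# Write a hypothetical ClickHouse schema to a file (nothing is executed against a server).
cat > clickhouse-logs-schema.sql << 'EOF'
-- Raw log table. Names and columns are illustrative; adjust them to your Vector output.
CREATE TABLE IF NOT EXISTS default.logs
(
    timestamp DateTime64(3),
    host      LowCardinality(String),
    appname   LowCardinality(String),
    severity  LowCardinality(String),
    message   String
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
ORDER BY (host, timestamp);

-- Per-minute event counts, updated automatically on every insert into default.logs.
CREATE MATERIALIZED VIEW IF NOT EXISTS default.logs_per_minute
ENGINE = SummingMergeTree
ORDER BY (minute, host, severity)
AS
SELECT
    toStartOfMinute(timestamp) AS minute,
    host,
    severity,
    count() AS events
FROM default.logs
GROUP BY minute, host, severity;
EOF
```

Once the stack is running, the file could be applied with something like docker compose exec -T clickhouse clickhouse-client --user <user> --password <password> --multiquery < clickhouse-logs-schema.sql, using the credentials from .env. Because SummingMergeTree collapses rows only during background merges, dashboard queries against the view should still use sum(events) with a GROUP BY.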

Common Use Cases

  • High-volume application log aggregation for e-commerce platforms processing millions of transactions daily
  • Security operations center (SOC) log analysis pipeline for threat detection and incident response
  • Multi-tenant SaaS platform centralized logging with per-tenant data isolation and cost tracking
  • Kubernetes cluster observability with pod log collection, metric extraction, and distributed tracing correlation
  • Compliance logging for financial services requiring tamper-proof audit trails and long-term retention
  • Real-time fraud detection system processing payment logs with complex pattern matching and alerting
  • DevOps pipeline monitoring aggregating CI/CD logs, deployment metrics, and infrastructure health data

Prerequisites

  • Minimum 8GB RAM recommended (2GB for ClickHouse, 1GB for Prometheus, 512MB for Grafana, remainder for Vector buffers)
  • Basic understanding of TOML configuration syntax for Vector pipeline definitions
  • Familiarity with PromQL for creating custom metrics dashboards and alerts
  • Knowledge of ClickHouse SQL dialect differences from standard SQL for log analysis queries
  • Available host ports 3000, 8123, 8686, 9000, 9090, 9598, 514/UDP, and 6514 for service communication
  • Understanding of log formats (JSON, syslog, CEF) that will be ingested by Vector

For development & testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

docker-compose.yml
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9598:9598"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - vector_data:/var/lib/vector
    networks:
      - vector_net

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
    environment:
      - CLICKHOUSE_USER=${CLICKHOUSE_USER}
      - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
    volumes:
      - clickhouse_data:/var/lib/clickhouse
    networks:
      - vector_net

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    networks:
      - vector_net

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
      - prometheus
    networks:
      - vector_net

volumes:
  vector_data:
  clickhouse_data:
  prometheus_data:
  grafana_data:

networks:
  vector_net:
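
The compose file mounts ./prometheus.yml into the Prometheus container, but the recipe does not show that file. A minimal sketch is given here, assuming Vector exposes its metrics through a prometheus_exporter sink on port 9598 (the published port suggests this, but the recipe does not confirm it). The job name and the 15s scrape interval are illustrative choices.

```shell
# Create a minimal Prometheus config next to docker-compose.yml.
cat > prometheus.yml << 'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  # Vector's own metrics; assumes a prometheus_exporter sink listening on 9598.
  - job_name: vector
    static_configs:
      - targets: ["vector:9598"]
EOF
```

The target uses the compose service name vector, which resolves inside the vector_net network. Creating this file before the first docker compose up matters, because Docker otherwise creates a directory at the missing bind-mount path.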

.env Template

.env
# Vector Pipeline
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=secure_clickhouse_password
GRAFANA_PASSWORD=secure_grafana_password

# Vector API at http://localhost:8686
# ClickHouse at http://localhost:8123
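
Since the placeholder passwords in the template are meant to be changed, one possible way to generate the file with random values is sketched here. The helper name random_secret is hypothetical, and the approach (16 random bytes rendered as hex) is just one reasonable choice, not something the recipe prescribes.

```shell
# Produce a 32-character hex string from the kernel's random source.
random_secret() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Unquoted EOF so that the $(random_secret) calls are expanded when the file is written.
cat > .env << EOF
# Vector Pipeline
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=$(random_secret)
GRAFANA_PASSWORD=$(random_secret)
EOF

# Restrict the file to the current user, since it holds credentials.
chmod 600 .env
```

Note that this overwrites any existing .env in the current directory.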

Usage Notes

  1. Vector API at http://localhost:8686
  2. Syslog input at port 514/UDP
  3. ClickHouse for log storage
  4. Configure vector.toml for pipelines
  5. Written in Rust for performance
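
The recipe leaves vector.toml to the reader, so here is a minimal sketch of what it might contain for this stack. Several things are assumptions rather than part of the recipe: the component names (syslog_udp, vector_metrics, normalize, clickhouse, prometheus), a default.logs table that already exists in ClickHouse, and CLICKHOUSE_USER and CLICKHOUSE_PASSWORD being passed into the vector container (the compose file does not do this; an env_file or environment entry on the vector service would be needed). The TLS syslog port 6514 is not covered.

```shell
# Quoted EOF keeps ${...} literal so Vector, not the shell, resolves the variables.
cat > vector.toml << 'EOF'
# Persistent state directory; matches the vector_data volume mount.
data_dir = "/var/lib/vector"

# API bound to all interfaces so the published port 8686 is reachable.
[api]
enabled = true
address = "0.0.0.0:8686"

# Syslog over UDP on the published port 514.
[sources.syslog_udp]
type = "syslog"
mode = "udp"
address = "0.0.0.0:514"

# Vector's own operational metrics.
[sources.vector_metrics]
type = "internal_metrics"

# VRL transform: lower-case the severity, falling back to "unknown" if it is not a string.
[transforms.normalize]
type = "remap"
inputs = ["syslog_udp"]
source = '''
.severity = downcase(string(.severity) ?? "unknown")
'''

# Write logs to ClickHouse over HTTP; default.logs is a hypothetical, pre-created table.
[sinks.clickhouse]
type = "clickhouse"
inputs = ["normalize"]
endpoint = "http://clickhouse:8123"
database = "default"
table = "logs"
skip_unknown_fields = true
date_time_best_effort = true

[sinks.clickhouse.auth]
strategy = "basic"
user = "${CLICKHOUSE_USER}"
password = "${CLICKHOUSE_PASSWORD}"

# Expose internal metrics for Prometheus on the published port 9598.
[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["vector_metrics"]
address = "0.0.0.0:9598"
EOF
```

Setting skip_unknown_fields lets ClickHouse ignore event fields that have no matching column, and date_time_best_effort lets it parse the RFC 3339 timestamps Vector sends. Both are convenience choices for a first run; a stricter pipeline would shape events explicitly in the remap transform.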

Individual Services (4 services)

Copy individual services to mix and match with your existing compose files.

vector
vector:
  image: timberio/vector:latest-alpine
  ports:
    - "8686:8686"
    - "9598:9598"
    - "514:514/udp"
    - "6514:6514"
  volumes:
    - ./vector.toml:/etc/vector/vector.toml:ro
    - vector_data:/var/lib/vector
  networks:
    - vector_net

clickhouse
clickhouse:
  image: clickhouse/clickhouse-server:latest
  ports:
    - "8123:8123"
    - "9000:9000"
  environment:
    - CLICKHOUSE_USER=${CLICKHOUSE_USER}
    - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
    - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
  volumes:
    - clickhouse_data:/var/lib/clickhouse
  networks:
    - vector_net

prometheus
prometheus:
  image: prom/prometheus:latest
  ports:
    - "9090:9090"
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    - prometheus_data:/prometheus
  networks:
    - vector_net

grafana
grafana:
  image: grafana/grafana:latest
  ports:
    - "3000:3000"
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
  volumes:
    - grafana_data:/var/lib/grafana
  depends_on:
    - clickhouse
    - prometheus
  networks:
    - vector_net

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9598:9598"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - vector_data:/var/lib/vector
    networks:
      - vector_net

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
    environment:
      - CLICKHOUSE_USER=${CLICKHOUSE_USER}
      - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
    volumes:
      - clickhouse_data:/var/lib/clickhouse
    networks:
      - vector_net

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    networks:
      - vector_net

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
      - prometheus
    networks:
      - vector_net

volumes:
  vector_data:
  clickhouse_data:
  prometheus_data:
  grafana_data:

networks:
  vector_net:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Vector Pipeline
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=secure_clickhouse_password
GRAFANA_PASSWORD=secure_grafana_password

# Vector API at http://localhost:8686
# ClickHouse at http://localhost:8123
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
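
After the services start, a quick smoke test can confirm that each one answers on its published port. The sketch below assumes curl is installed on the host and that Vector's API is enabled and bound to 0.0.0.0:8686 in vector.toml (otherwise its /health endpoint will not respond). The check helper is a hypothetical name introduced for this example.

```shell
# Print UP or DOWN depending on whether the URL answers with a successful HTTP status.
check() {
  name="$1"
  url="$2"
  if curl -fsS --max-time 3 -o /dev/null "$url" 2>/dev/null; then
    echo "$name: UP"
  else
    echo "$name: DOWN"
  fi
}

check "vector"     "http://localhost:8686/health"
check "clickhouse" "http://localhost:8123/ping"
check "prometheus" "http://localhost:9090/-/healthy"
check "grafana"    "http://localhost:3000/api/health"
```

A DOWN line right after startup is not necessarily a failure, since ClickHouse and Grafana can take several seconds to become ready. If it persists, docker compose logs for the named service is the next place to look.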

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/vector-aggregator/run | bash

Troubleshooting

  • Vector buffer full errors: Increase the sink's buffer size in vector.toml, or check ClickHouse write performance with SHOW PROCESSLIST
  • ClickHouse connection refused or authentication failed: Confirm the clickhouse container is running and reachable on vector_net, and that the credentials Vector uses match the CLICKHOUSE_USER and CLICKHOUSE_PASSWORD values set on the container
  • Vector transformation parsing errors: Enable debug logging (for example by setting VECTOR_LOG=debug) and check the configuration, including embedded VRL, with the vector validate command
  • Grafana ClickHouse queries timing out: Make sure the table's ORDER BY key covers the timestamp and frequently filtered columns, and add data-skipping indexes where needed
  • High memory usage in Vector: Reduce the buffer.max_events setting and enable compression in sink configurations
  • Missing logs in ClickHouse: Check Vector logs for sink errors and verify that the ClickHouse table schema matches Vector's output format
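
The two buffer-related items above can be illustrated with a configuration fragment. This is a sketch meant to be merged into vector.toml by hand, assuming a sink named clickhouse (a hypothetical name). The file name vector-buffer-example.toml and the chosen values are illustrative only.

```shell
# Write an example buffer fragment; merge the relevant table into vector.toml manually.
cat > vector-buffer-example.toml << 'EOF'
# Option A: bounded in-memory buffer. A lower max_events holds fewer events in memory.
[sinks.clickhouse.buffer]
type = "memory"
max_events = 250
when_full = "block"

# Option B: disk buffer stored under data_dir. Use instead of Option A, not alongside it.
# [sinks.clickhouse.buffer]
# type = "disk"
# max_size = 268435488   # bytes; the smallest size Vector accepts for a disk buffer
# when_full = "block"
EOF
```

With when_full set to block, a full buffer applies backpressure to upstream components instead of dropping events. For UDP syslog that can still mean loss at the socket, so a disk buffer is the safer option when ClickHouse outages are expected.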
