docker.recipes

Vector Log Pipeline

Difficulty: intermediate

High-performance observability data pipeline for logs, metrics, and traces.

Overview

Vector is a Rust-based observability data pipeline that collects, transforms, and routes logs and metrics, with more limited support for traces. Originally developed by Timber.io (now part of Datadog), it is designed for high event rates with low resource overhead. Its vendor-neutral design and built-in transformation language (VRL) suit teams that want to standardize their observability pipeline without vendor lock-in.

This stack pairs Vector with ClickHouse, a column-oriented analytics database, and Grafana for visualization. Vector ingests logs from one or more sources, applies transformations and enrichments, and streams the results to ClickHouse for storage and analysis. ClickHouse's columnar storage is well suited to analytical queries over large log tables, and queries that filter on the table's sorting key and a time range can stay fast even at billions of rows. Grafana builds dashboards and alerts on top of the stored data.

The combination is aimed at engineering teams and SREs who want near-real-time log analytics without the cost of managed products such as Splunk or Elastic Cloud. It can be a good fit for high-throughput environments where an ELK stack becomes expensive to operate, and because ClickHouse is queried with SQL it is approachable for teams used to relational databases. Organizations ingesting hundreds of gigabytes to terabytes of logs per day may find it a lower-cost option, provided storage and retention are sized for that volume.

Key Features

  • Vector's topology-based configuration for complex data routing and transformation pipelines
  • Real-time log parsing with Vector's built-in VRL (Vector Remap Language) for field extraction and enrichment
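  • ClickHouse's MergeTree engine, well suited to time-series log data when the table declares a PARTITION BY key (for example, one partition per day)
  • Fast analytical queries over very large log tables using ClickHouse's columnar storage, often sub-second when filters match the sorting key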
  • Native ClickHouse data source integration in Grafana for log analytics dashboards
  • Vector's built-in metrics and health endpoints for pipeline observability
  • ClickHouse materialized views for pre-aggregated log metrics and faster dashboard queries
  • Grafana Explore interface for ad-hoc log querying using SQL syntax
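
The MergeTree and materialized-view features listed above could be set up with a schema along these lines. This is a minimal sketch, not part of the recipe: the file name init-logs.sql, the database logs, the tables app_logs and app_logs_per_minute, the column set, and the 30-day TTL are all assumptions chosen for illustration.

```shell
# Write an illustrative schema file (all names here are assumptions, not part of the recipe).
cat > init-logs.sql << 'EOF'
CREATE DATABASE IF NOT EXISTS logs;

-- Raw log events. Partitioning is explicit: one partition per day.
CREATE TABLE IF NOT EXISTS logs.app_logs
(
    timestamp      DateTime64(3),
    container_name LowCardinality(String),
    stream         LowCardinality(String),
    level          LowCardinality(String),
    message        String
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
ORDER BY (container_name, timestamp)
TTL toDateTime(timestamp) + INTERVAL 30 DAY;

-- Per-minute event counts, filled by the materialized view below.
CREATE TABLE IF NOT EXISTS logs.app_logs_per_minute
(
    minute         DateTime,
    container_name LowCardinality(String),
    level          LowCardinality(String),
    events         UInt64
)
ENGINE = SummingMergeTree
PARTITION BY toDate(minute)
ORDER BY (container_name, level, minute);

CREATE MATERIALIZED VIEW IF NOT EXISTS logs.app_logs_per_minute_mv
TO logs.app_logs_per_minute AS
SELECT
    toStartOfMinute(toDateTime(timestamp)) AS minute,
    container_name,
    level,
    count() AS events
FROM logs.app_logs
GROUP BY minute, container_name, level;
EOF
```

With the stack running, the file could be applied with docker compose exec -T clickhouse clickhouse-client --multiquery < init-logs.sql, adding --user and --password if your users file requires them. Dashboards reading app_logs_per_minute should aggregate with sum(events), because SummingMergeTree only collapses rows during background merges.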

Common Use Cases

  • High-volume application log analytics for microservices architectures generating 100GB+ daily
  • Security information and event management (SIEM) with real-time threat detection queries
  • Infrastructure monitoring for Kubernetes clusters with container log aggregation and analysis
  • Cost-effective replacement for Splunk or Elastic Cloud in enterprise environments
  • Real-time business intelligence dashboards based on application event logs
  • Compliance logging with long-term retention and fast audit queries
  • DevOps troubleshooting with correlated logs, metrics, and traces in a single interface

Prerequisites

  • Docker host with minimum 4GB RAM (8GB+ recommended for production workloads)
  • Available ports 3000, 8123, 8686, 9000, and 9598 for service access
  • Basic understanding of log formats and parsing requirements for your data sources
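  • Vector configuration file (vector.toml) defining sources, transforms, and sinks, placed next to docker-compose.yml
  • ClickHouse users configuration file (clickhouse-users.xml) for authentication and access control, placed next to docker-compose.yml
  • Sufficient disk space for log retention (ClickHouse often compresses log data by around 10:1, depending on the data and column codecs)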
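
The recipe expects a vector.toml but does not ship one. The sketch below shows one way such a file could look, assuming container logs are read through the mounted Docker socket and written to ClickHouse over HTTP. The database logs, the table app_logs, the user vector, and the password change_me are placeholders, not values from the recipe, and the table must already exist in ClickHouse.

```shell
# Write a minimal vector.toml next to docker-compose.yml.
# Database, table, user, and password values are placeholders (assumptions).
cat > vector.toml << 'EOF'
# Expose the API on all interfaces so the 8686 port mapping works.
[api]
enabled = true
address = "0.0.0.0:8686"

# Source: logs of containers on this Docker host (uses the mounted docker.sock).
[sources.docker]
type = "docker_logs"

# Source: Vector's own metrics, exported on 9598 below.
[sources.internal]
type = "internal_metrics"

# Transform: if the log line is JSON, copy its "level" field to the top level.
[transforms.parse]
type = "remap"
inputs = ["docker"]
source = '''
parsed, err = parse_json(.message)
if err == null {
  .level = parsed.level
}
'''

# Sink: insert events into ClickHouse over its HTTP interface.
[sinks.clickhouse]
type = "clickhouse"
inputs = ["parse"]
endpoint = "http://clickhouse:8123"
database = "logs"
table = "app_logs"
skip_unknown_fields = true
date_time_best_effort = true
auth.strategy = "basic"
auth.user = "vector"
auth.password = "change_me"

# Sink: Prometheus scrape endpoint for pipeline metrics.
[sinks.metrics]
type = "prometheus_exporter"
inputs = ["internal"]
address = "0.0.0.0:9598"
EOF
```

Create the file before starting the stack, since the compose file bind-mounts it. Once the container is up, docker compose exec vector vector validate /etc/vector/vector.toml can be used to check the configuration.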

For development and testing. Review security settings, change default credentials, and test thoroughly before production use.

docker-compose.yml

services:
  vector:
    image: timberio/vector:latest-alpine
    container_name: vector
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - vector-data:/var/lib/vector
    ports:
      - "8686:8686"
      - "9598:9598"
    networks:
      - vector-network
    restart: unless-stopped

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    container_name: vector-clickhouse
    volumes:
      - clickhouse-data:/var/lib/clickhouse
      - ./clickhouse-users.xml:/etc/clickhouse-server/users.d/users.xml:ro
    ports:
      - "8123:8123"
      - "9000:9000"
    networks:
      - vector-network
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: vector-grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_INSTALL_PLUGINS=grafana-clickhouse-datasource
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - vector-network
    restart: unless-stopped

volumes:
  vector-data:
  clickhouse-data:
  grafana-data:

networks:
  vector-network:
    driver: bridge
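
The compose file bind-mounts ./clickhouse-users.xml into the ClickHouse container. If that path does not exist on the host, Docker creates a directory with that name and ClickHouse will not find a usable users file. A minimal sketch of the file follows; the user name vector and the password change_me are placeholders for illustration, not values from the recipe.

```shell
# Write an illustrative users file; the user name and password are placeholders.
cat > clickhouse-users.xml << 'EOF'
<clickhouse>
  <users>
    <!-- Hypothetical account used by Vector and Grafana in this sketch. -->
    <vector>
      <password>change_me</password>
      <networks>
        <!-- Allow connections from any address; narrow this for production. -->
        <ip>::/0</ip>
      </networks>
      <profile>default</profile>
      <quota>default</quota>
    </vector>
  </users>
</clickhouse>
EOF
```

For anything beyond local testing, a hashed credential (the password_sha256_hex element) is preferable to a plain-text password, and the same account then has to be configured in Vector's sink and in the Grafana data source.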

.env Template

.env
# Vector
GRAFANA_PASSWORD=secure_grafana_password

# Create vector.toml and clickhouse-users.xml next to docker-compose.yml before starting the stack

Usage Notes

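  1. Vector API at http://localhost:8686 (requires the API to be enabled in vector.toml and bound to 0.0.0.0)
  2. Vector metrics at http://localhost:9598/metrics (requires a prometheus_exporter sink in vector.toml)
  3. ClickHouse HTTP interface at http://localhost:8123 (native protocol on port 9000)
  4. Grafana at http://localhost:3000 (user admin, password from GRAFANA_PASSWORD)
  5. High-performance Rust-based pipeline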
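
The endpoints above can be probed from the Docker host with a short script such as the following sketch. It assumes curl is installed, that the default port mappings are unchanged, and that vector.toml enables the API and a prometheus_exporter sink; the function names endpoints and check_all are made up for this example.

```shell
# Print "name url" pairs for the health endpoints of this stack.
endpoints() {
  cat << 'EOF'
vector-api http://localhost:8686/health
vector-metrics http://localhost:9598/metrics
clickhouse http://localhost:8123/ping
grafana http://localhost:3000/api/health
EOF
}

# Probe each endpoint and report whether it answered with a success status.
check_all() {
  endpoints | while read -r name url; do
    if curl -fsS -o /dev/null --max-time 3 "$url" 2>/dev/null; then
      echo "ok   $name ($url)"
    else
      echo "FAIL $name ($url)"
    fi
  done
}

check_all
```

A FAIL line only means the URL did not return a success status within three seconds; docker compose ps and docker compose logs for the named service are the next places to look.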

Individual Services (3 services)

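Copy individual services to mix and match with your existing compose files. Each service references the vector-network network and a named volume, so declare those at the top level of the target file.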

vector
vector:
  image: timberio/vector:latest-alpine
  container_name: vector
  volumes:
    - ./vector.toml:/etc/vector/vector.toml:ro
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - vector-data:/var/lib/vector
  ports:
    - "8686:8686"
    - "9598:9598"
  networks:
    - vector-network
  restart: unless-stopped
clickhouse
clickhouse:
  image: clickhouse/clickhouse-server:latest
  container_name: vector-clickhouse
  volumes:
    - clickhouse-data:/var/lib/clickhouse
    - ./clickhouse-users.xml:/etc/clickhouse-server/users.d/users.xml:ro
  ports:
    - "8123:8123"
    - "9000:9000"
  networks:
    - vector-network
  restart: unless-stopped
grafana
grafana:
  image: grafana/grafana:latest
  container_name: vector-grafana
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    - GF_INSTALL_PLUGINS=grafana-clickhouse-datasource
  volumes:
    - grafana-data:/var/lib/grafana
  ports:
    - "3000:3000"
  networks:
    - vector-network
  restart: unless-stopped

Quick Start

terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  vector:
    image: timberio/vector:latest-alpine
    container_name: vector
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - vector-data:/var/lib/vector
    ports:
      - "8686:8686"
      - "9598:9598"
    networks:
      - vector-network
    restart: unless-stopped

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    container_name: vector-clickhouse
    volumes:
      - clickhouse-data:/var/lib/clickhouse
      - ./clickhouse-users.xml:/etc/clickhouse-server/users.d/users.xml:ro
    ports:
      - "8123:8123"
      - "9000:9000"
    networks:
      - vector-network
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    container_name: vector-grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_INSTALL_PLUGINS=grafana-clickhouse-datasource
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - vector-network
    restart: unless-stopped

volumes:
  vector-data:
  clickhouse-data:
  grafana-data:

networks:
  vector-network:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Vector
GRAFANA_PASSWORD=secure_grafana_password
EOF

# 3. Create vector.toml and clickhouse-users.xml in this directory.
#    Both are bind-mounted; if they are missing, Docker creates empty
#    directories in their place and the services cannot read their config.

# 4. Start the services
docker compose up -d

# 5. View logs
docker compose logs -f

One-Liner

Run this command to download and set up the recipe in one step:

terminal
curl -fsSL https://docker.recipes/api/recipes/vector-logs/run | bash

Troubleshooting

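  • Vector pipeline stalled with no data flowing: Check vector.toml syntax with the vector validate command and verify that each source can reach its input (for docker_logs, that /var/run/docker.sock is mounted)
  • ClickHouse connection refused on port 9000: Ensure the ClickHouse container is running and that clickhouse-users.xml defines the user and allowed networks you connect with. Port 9000 is the native protocol; Vector's clickhouse sink uses the HTTP interface on 8123
  • Grafana ClickHouse plugin installation fails: GF_INSTALL_PLUGINS downloads the plugin at startup, so check that the container has outbound network access and restart it, or use a pre-built image that includes the plugin
  • High memory usage in ClickHouse: Tune the max_memory_usage setting and check that the partitioning key in your table schemas does not create too many small partitions
  • Vector backpressure warnings: Increase the sink's buffer size in vector.toml or make ClickHouse inserts cheaper with larger batches
  • Dashboard queries timing out in Grafana: Align the table's ORDER BY key with common filters, add data-skipping indexes where they help, and always filter on a time range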
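
For the backpressure item above, buffer and batch settings live under the sink's own tables in vector.toml. The sketch below appends them for a sink assumed to be named clickhouse; that name and the numbers are illustrative starting points, not tuned recommendations, and the tables must not already be defined elsewhere in the file.

```shell
# Append tuning tables for a sink assumed to be named "clickhouse".
# The numbers are illustrative starting points, not recommendations.
cat >> vector.toml << 'EOF'

# Larger, less frequent inserts are generally easier on ClickHouse.
[sinks.clickhouse.batch]
max_events = 50000
timeout_secs = 5

# Disk buffer: absorb bursts and survive restarts; block upstream when full.
[sinks.clickhouse.buffer]
type = "disk"
max_size = 1073741824  # bytes (1 GiB)
when_full = "block"
EOF
```

Disk buffers are stored under Vector's data directory, which this recipe keeps in the vector-data volume. After changing the file, validating it with vector validate and restarting the container with docker compose restart vector applies the new settings.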
