Vector Log Aggregator Pipeline
High-performance observability pipeline for logs, metrics, and traces.
Overview
Vector is a high-performance observability data pipeline, written in Rust and maintained by Datadog, that collects, transforms, and routes logs, metrics, and traces. It was designed to address the performance limitations and memory overhead of traditional log aggregators such as Fluentd and Logstash. This stack pairs Vector with ClickHouse's columnar storage engine for analytics, Prometheus for metrics collection, and Grafana for visualization. Vector acts as the central data router: it ingests logs from multiple sources, transforms and enriches them, and writes them to ClickHouse for long-term storage and analysis, while exposing its own operational metrics to Prometheus.
This architecture suits organizations that need to process large volumes of observability data while keeping control over the pipeline, its costs, and its performance. Because every component is self-hosted, the stack avoids the recurring costs of SaaS solutions, which makes it a good fit for teams with significant log volumes, compliance requirements, or a need to build custom analytics on their observability data.
Key Features
- Vector's disk buffers persist queued events across restarts, reducing data loss when a sink or the Vector process fails
- ClickHouse's columnar storage provides sub-second query performance on billions of log entries with automatic data compression
- Vector's VRL (Vector Remap Language) enables complex log parsing, enrichment, and transformation without external dependencies
- Prometheus integration captures Vector's internal metrics including throughput, error rates, and buffer utilization
- ClickHouse materialized views automatically aggregate log data for faster dashboard queries and reporting
- Vector's adaptive concurrency automatically optimizes sink throughput based on downstream system performance
- Grafana's ClickHouse data source plugin enables SQL-based log exploration with time-series visualization
- Vector's topology health checks monitor sink connectivity and automatically buffer data during outages
Common Use Cases
- High-volume application log aggregation for e-commerce platforms processing millions of transactions daily
- Security operations center (SOC) log analysis pipeline for threat detection and incident response
- Multi-tenant SaaS platform centralized logging with per-tenant data isolation and cost tracking
- Kubernetes cluster observability with pod log collection, metric extraction, and distributed tracing correlation
- Compliance logging for financial services requiring tamper-proof audit trails and long-term retention
- Real-time fraud detection system processing payment logs with complex pattern matching and alerting
- DevOps pipeline monitoring aggregating CI/CD logs, deployment metrics, and infrastructure health data
Prerequisites
- Minimum 8GB RAM recommended (2GB for ClickHouse, 1GB for Prometheus, 512MB for Grafana, remainder for Vector buffers)
- Basic understanding of TOML configuration syntax for Vector pipeline definitions
- Familiarity with PromQL for creating custom metrics dashboards and alerts
- Knowledge of ClickHouse SQL dialect differences from standard SQL for log analysis queries
- Available ports 3000, 8123, 8686, 9000, 9090, 9598, 514/UDP, and 6514/TCP for service communication
- Understanding of log formats (JSON, syslog, CEF) that will be ingested by Vector
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9598:9598"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - vector_data:/var/lib/vector
    networks:
      - vector_net

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
    environment:
      - CLICKHOUSE_USER=${CLICKHOUSE_USER}
      - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
    volumes:
      - clickhouse_data:/var/lib/clickhouse
    networks:
      - vector_net

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    networks:
      - vector_net

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
      - prometheus
    networks:
      - vector_net

volumes:
  vector_data:
  clickhouse_data:
  prometheus_data:
  grafana_data:

networks:
  vector_net:
.env Template
.env
# Vector Pipeline
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=secure_clickhouse_password
GRAFANA_PASSWORD=secure_grafana_password

# Vector API at http://localhost:8686
# ClickHouse at http://localhost:8123
Usage Notes
- Vector API at http://localhost:8686
- Syslog input at port 514/UDP
- ClickHouse for log storage
- Configure vector.toml for pipelines
- Written in Rust for performance
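The compose file mounts ./vector.toml, but the recipe does not ship one. The following is a minimal sketch of what it could contain, written with the same heredoc style as the Quick Start. The component names (syslog_udp, syslog_tcp, normalize, clickhouse_logs, vector_metrics), the table name logs, and the use of port 6514 as plain TCP syslog are assumptions for illustration, not part of the recipe. The config reads CLICKHOUSE_USER and CLICKHOUSE_PASSWORD from the environment, so it assumes those variables are also passed to the vector service, which the compose file above does not do.

```shell
# Write an illustrative vector.toml next to docker-compose.yml.
# The quoted 'EOF' keeps ${...} literal so Vector, not the shell, expands it.
cat > vector.toml << 'EOF'
# Directory for disk buffers and checkpoints (the vector_data volume).
data_dir = "/var/lib/vector"

[api]
enabled = true
address = "0.0.0.0:8686"

# Syslog over UDP, matching the published 514/udp port.
[sources.syslog_udp]
type = "syslog"
mode = "udp"
address = "0.0.0.0:514"

# Assumption: 6514 is used as plain TCP syslog here (no TLS configured).
[sources.syslog_tcp]
type = "syslog"
mode = "tcp"
address = "0.0.0.0:6514"

# Vector's own operational metrics.
[sources.internal]
type = "internal_metrics"

# VRL transform: normalize severity and tag each event.
[transforms.normalize]
type = "remap"
inputs = ["syslog_udp", "syslog_tcp"]
source = '''
.severity = downcase(string(.severity) ?? "unknown")
.pipeline = "vector-aggregator"
'''

# Write events to an assumed default.logs table over ClickHouse's HTTP port.
[sinks.clickhouse_logs]
type = "clickhouse"
inputs = ["normalize"]
endpoint = "http://clickhouse:8123"
database = "default"
table = "logs"
skip_unknown_fields = true
date_time_best_effort = true

[sinks.clickhouse_logs.auth]
strategy = "basic"
user = "${CLICKHOUSE_USER}"
password = "${CLICKHOUSE_PASSWORD}"

# Disk buffer so queued events survive restarts (about 256 MiB).
[sinks.clickhouse_logs.buffer]
type = "disk"
max_size = 268435488
when_full = "block"

# Expose internal metrics for Prometheus to scrape on port 9598.
[sinks.vector_metrics]
type = "prometheus_exporter"
inputs = ["internal"]
address = "0.0.0.0:9598"
EOF
```

Depending on the image version, Vector may look for /etc/vector/vector.yaml by default, in which case the service may need an explicit --config /etc/vector/vector.toml argument for this file to be picked up.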
Individual Services (4 services)
Copy individual services to mix and match with your existing compose files.
vector
vector:
  image: timberio/vector:latest-alpine
  ports:
    - "8686:8686"
    - "9598:9598"
    - "514:514/udp"
    - "6514:6514"
  volumes:
    - ./vector.toml:/etc/vector/vector.toml:ro
    - vector_data:/var/lib/vector
  networks:
    - vector_net
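To check that the published syslog port accepts events, it can help to send a hand-built test message. The sketch below formats an RFC 5424 line; the function name syslog_line and the sample application name are hypothetical, and it assumes Vector is configured with a syslog source listening on port 514.

```shell
# Build one RFC 5424 syslog line:
#   <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
# PRI 14 = facility "user" (1) * 8 + severity "info" (6).
# $1 = application name, $2 = message text.
syslog_line() {
  printf '<14>1 %s %s %s - - - %s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "${HOSTNAME:-localhost}" \
    "$1" \
    "$2"
}

# Print a sample line to inspect the format.
syslog_line recipe-test "hello from the vector recipe"
```

In bash, the line could then be sent over UDP with `syslog_line recipe-test "hello" > /dev/udp/127.0.0.1/514`. UDP gives no delivery confirmation, so the Vector logs or the ClickHouse table are the places to confirm arrival.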
clickhouse
clickhouse:
  image: clickhouse/clickhouse-server:latest
  ports:
    - "8123:8123"
    - "9000:9000"
  environment:
    - CLICKHOUSE_USER=${CLICKHOUSE_USER}
    - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
    - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
  volumes:
    - clickhouse_data:/var/lib/clickhouse
  networks:
    - vector_net
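ClickHouse needs a table before Vector can insert into it, and the recipe does not define one. As a rough sketch, the script below writes a DDL file for a hypothetical default.logs table whose columns mirror common fields of Vector's syslog events. The table name, column set, partitioning, and 30-day TTL are all assumptions to adapt to your own data.

```shell
# Write an illustrative schema for syslog events.
cat > create_logs_table.sql << 'EOF'
CREATE TABLE IF NOT EXISTS default.logs
(
    timestamp DateTime64(3),
    host      String,
    appname   String,
    severity  LowCardinality(String),
    message   String
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
-- The sorting key doubles as the primary index for time-range queries.
ORDER BY (timestamp, host)
-- Assumed retention: drop rows older than 30 days.
TTL toDateTime(timestamp) + INTERVAL 30 DAY;
EOF
```

Once the stack is running, the file could be loaded with `docker compose exec -T clickhouse clickhouse-client --user "$CLICKHOUSE_USER" --password "$CLICKHOUSE_PASSWORD" --multiquery < create_logs_table.sql`. Event fields that are not columns in the table are only dropped silently if the Vector sink sets skip_unknown_fields.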
prometheus
prometheus:
  image: prom/prometheus:latest
  ports:
    - "9090:9090"
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    - prometheus_data:/prometheus
  networks:
    - vector_net
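The prometheus service mounts ./prometheus.yml, which the recipe also leaves to the reader. A minimal sketch is shown below. It assumes Vector exposes a prometheus_exporter sink on port 9598 (the port published in the compose file). The job name and the 15-second interval are arbitrary choices.

```shell
# Write an illustrative prometheus.yml that scrapes Vector's metrics endpoint.
cat > prometheus.yml << 'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  # "vector" resolves through the shared vector_net compose network.
  - job_name: vector
    static_configs:
      - targets: ["vector:9598"]
EOF
```

After startup, the target's health can be checked on the Prometheus targets page at http://localhost:9090/targets.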
grafana
grafana:
  image: grafana/grafana:latest
  ports:
    - "3000:3000"
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
  volumes:
    - grafana_data:/var/lib/grafana
  depends_on:
    - clickhouse
    - prometheus
  networks:
    - vector_net
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9598:9598"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - vector_data:/var/lib/vector
    networks:
      - vector_net

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9000:9000"
    environment:
      - CLICKHOUSE_USER=${CLICKHOUSE_USER}
      - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
    volumes:
      - clickhouse_data:/var/lib/clickhouse
    networks:
      - vector_net

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    networks:
      - vector_net

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
      - prometheus
    networks:
      - vector_net

volumes:
  vector_data:
  clickhouse_data:
  prometheus_data:
  grafana_data:

networks:
  vector_net:
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# Vector Pipeline
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=secure_clickhouse_password
GRAFANA_PASSWORD=secure_grafana_password

# Vector API at http://localhost:8686
# ClickHouse at http://localhost:8123
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/vector-aggregator/run | bash
Troubleshooting
- Vector buffer full errors: Increase buffer size in vector.toml or check ClickHouse write performance with SHOW PROCESSLIST
- ClickHouse connection refused: Verify CLICKHOUSE_USER and CLICKHOUSE_PASSWORD environment variables match container settings
- Vector transformation parsing errors: Enable Vector debug logging and validate the configuration, including its VRL programs, with the vector validate command
- Grafana ClickHouse queries timing out: Make sure the table's ORDER BY key starts with the timestamp, restrict queries to a time range, and consider data-skipping indexes on frequently filtered columns
- High memory usage in Vector: Reduce buffer.max_events setting and enable compression in sink configurations
- Missing logs in ClickHouse: Check Vector logs for sink errors and verify ClickHouse table schema matches Vector output format
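Several of the checks above can be wrapped in small shell helpers. The sketch below assumes it is run from the directory containing docker-compose.yml and .env. The function names are hypothetical, and the row-count example in the usage note refers to an assumed default.logs table.

```shell
# Validate the mounted config (syntax, VRL programs, topology) without
# running environment checks such as sink health checks.
validate_vector_config() {
  docker compose exec vector vector validate --no-environment /etc/vector/vector.toml
}

# Show error and warning lines from Vector's own logs, e.g. sink failures.
# More verbose output can be enabled by setting VECTOR_LOG=debug on the service.
vector_sink_errors() {
  docker compose logs --no-color vector | grep -iE 'error|warn'
}

# Run a single query against ClickHouse using the credentials from .env.
# $1 = SQL text.
clickhouse_query() {
  # Load CLICKHOUSE_USER / CLICKHOUSE_PASSWORD if the .env file is present.
  if [ -f .env ]; then . ./.env; fi
  docker compose exec -T clickhouse clickhouse-client \
    --user "${CLICKHOUSE_USER:-default}" \
    --password "${CLICKHOUSE_PASSWORD:-}" \
    --query "$1"
}
```

For example, `clickhouse_query 'SHOW PROCESSLIST'` lists running queries when writes seem slow, and `clickhouse_query 'SELECT count() FROM default.logs'` gives a quick check of whether events are arriving.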
Components
vector, clickhouse, grafana, prometheus
Tags
#vector #datadog #logs #metrics #pipeline
Category
Monitoring & Observability