Vector Log Pipeline
High-performance observability pipeline with Vector, ClickHouse storage, and Grafana.
Overview
Vector is a high-performance observability data pipeline, originally built by Timber.io and now maintained by Datadog, that collects, transforms, and routes logs, metrics, and traces at scale. Created to address the limitations of older log processors such as Logstash and Fluentd, it offers better performance, memory safety through Rust, and more flexible data transformation.

This recipe builds a complete log analytics stack: Vector collects and processes logs from multiple sources, ClickHouse stores them in a column-oriented database for fast analytical queries, and Grafana visualizes the results in dashboards. ClickHouse's columnar layout and aggressive compression let it answer queries over billions of log entries quickly, often in under a second for well-designed tables.

The combination avoids the cost of proprietary logging products while remaining suitable for log analytics, real-time monitoring, and operational insight. It suits organizations that process high log volumes and need both real-time ingestion and fast historical analysis. Because Vector parses, filters, and transforms logs in flight and ClickHouse is queried with SQL, the stack is a practical alternative to ELK deployments, especially when SQL is preferred over Elasticsearch's query DSL.
Key Features
- Vector's high-performance log ingestion with support for 50+ data sources including syslog, Docker, Kubernetes, and file tailing
- Real-time log transformation and filtering using Vector Remap Language (VRL) for field extraction and data enrichment
- ClickHouse columnar storage, which can run analytical queries one to two orders of magnitude faster than row-based log storage, depending on schema and workload
- Grafana ClickHouse data source integration for SQL-based dashboard creation and log exploration
- Vector's backpressure handling and automatic retries ensuring reliable log delivery to ClickHouse
- ClickHouse MergeTree engine suited to time-series log data, with partitioning and TTL policies declared in the table schema
- NGINX reverse proxy providing unified access point and load balancing for the entire observability stack
- Vector's built-in metrics and health endpoints for monitoring the pipeline performance itself
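The MergeTree, partitioning, and TTL points above can be made concrete with a table definition. What follows is a minimal sketch, not a schema shipped with this recipe: the table name `app_logs`, its columns, the daily partitioning, and the 30-day retention are all illustrative assumptions. Only the `logs` database name comes from the .env template.

```shell
# Write an illustrative ClickHouse schema to a file (hypothetical table and columns).
cat > create_logs_table.sql << 'EOF'
CREATE TABLE IF NOT EXISTS logs.app_logs
(
    timestamp DateTime64(3),           -- event time with millisecond precision
    service   LowCardinality(String),  -- few distinct values, so dictionary-encode
    level     LowCardinality(String),
    host      String,
    message   String
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)         -- one partition per day
ORDER BY (service, timestamp)          -- sorting key used for range scans
TTL toDateTime(timestamp) + INTERVAL 30 DAY;  -- rows expire after 30 days
EOF

# Once the stack is running, the file could be applied with something like:
#   docker compose exec -T clickhouse clickhouse-client \
#     --user vector --password secure_clickhouse_password < create_logs_table.sql
```

Ordering by `(service, timestamp)` favors queries that filter on one service over a time range. A different sorting key may fit better if most queries filter on something else.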
Common Use Cases
- Application log analytics for debugging and performance monitoring in microservices architectures
- Infrastructure monitoring for server logs, system metrics, and container orchestration platforms
- Security information and event management (SIEM) with fast log correlation and threat detection queries
- Business intelligence dashboards combining application logs with operational metrics for product insights
- Compliance logging with long-term retention and fast audit trail queries using ClickHouse's compression
- DevOps teams replacing expensive Splunk or Datadog deployments with open-source log analytics
- Real-time alerting based on log patterns and anomaly detection using Grafana's alerting rules
Prerequisites
- Minimum 8GB RAM recommended for ClickHouse analytical workloads and log retention
- Docker host with sufficient disk I/O performance for high-volume log ingestion and ClickHouse storage
- Network access to the log sources that Vector will collect from (for example, port 514 for syslog, and Docker socket access for container logs)
- Basic understanding of SQL for creating ClickHouse schemas and Grafana dashboard queries
- Familiarity with Vector Remap Language (VRL) for log parsing and transformation configuration
- Available ports: 80 (NGINX), 3000 (Grafana), 8123 (ClickHouse HTTP), 8686 (Vector API), 9000 (Vector sources), 9001 (ClickHouse native protocol, mapped from container port 9000)
For development & testing. Review security settings, change default credentials, and test thoroughly before production use.
docker-compose.yml
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9000:9000"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9001:9000"
    volumes:
      - clickhouse_data:/var/lib/clickhouse
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/config.xml:ro
    environment:
      CLICKHOUSE_DB: ${CLICKHOUSE_DB}
      CLICKHOUSE_USER: ${CLICKHOUSE_USER}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    networks:
      - vector-net
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_INSTALL_PLUGINS: grafana-clickhouse-datasource
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - grafana
    networks:
      - vector-net
    restart: unless-stopped

volumes:
  clickhouse_data:
  grafana_data:

networks:
  vector-net:
    driver: bridge
.env Template
.env
# ClickHouse
CLICKHOUSE_DB=logs
CLICKHOUSE_USER=vector
CLICKHOUSE_PASSWORD=secure_clickhouse_password

# Grafana
GRAFANA_PASSWORD=secure_grafana_password
Usage Notes
- Vector API at http://localhost:8686
- ClickHouse HTTP at http://localhost:8123
- Grafana dashboards at http://localhost:3000
- Configure vector.toml for sources and sinks
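The recipe mounts ./vector.toml but does not supply one, so a starting point may help. The sketch below is one possible configuration, not the recipe's own. The component names (`docker`, `tcp_in`, `normalize`, `clickhouse`), the `app_logs` table, the `service` and `level` fields, and the batch values are assumptions for illustration. The credentials are copied from the .env template because the compose file does not pass those variables to the Vector container.

```shell
# Write an illustrative vector.toml next to docker-compose.yml.
cat > vector.toml << 'EOF'
# Expose the API and health endpoint on the port published by the compose file
[api]
enabled = true
address = "0.0.0.0:8686"

# Container logs via the mounted Docker socket
[sources.docker]
type = "docker_logs"

# Newline-delimited logs pushed over TCP to the published port 9000
[sources.tcp_in]
type = "socket"
mode = "tcp"
address = "0.0.0.0:9000"

# VRL transform: derive the (hypothetical) service and level fields
[transforms.normalize]
type = "remap"
inputs = ["docker", "tcp_in"]
source = '''
# Events without a container name (for example TCP input) get a placeholder
.service = string(.container_name) ?? "unknown"
# Events without a severity are treated as "info"
.level = downcase(string(.level) ?? "info")
'''

# Insert into ClickHouse over its HTTP interface
[sinks.clickhouse]
type = "clickhouse"
inputs = ["normalize"]
endpoint = "http://clickhouse:8123"
database = "logs"
table = "app_logs"             # hypothetical table; it must already exist
skip_unknown_fields = true     # ignore event fields the table has no column for
date_time_best_effort = true   # let ClickHouse parse RFC 3339 timestamps

[sinks.clickhouse.auth]
strategy = "basic"
user = "vector"
password = "secure_clickhouse_password"

# Illustrative batch settings; larger batches mean fewer, bigger inserts
[sinks.clickhouse.batch]
max_events = 10000
timeout_secs = 5
EOF
```

Depending on the image version, Vector may load a bundled default configuration rather than this file. If that happens, setting the service command to `--config /etc/vector/vector.toml` is one way to point it at the mounted file.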
Individual Services (4 services)
Copy individual services to mix and match with your existing compose files.
vector
vector:
  image: timberio/vector:latest-alpine
  ports:
    - "8686:8686"
    - "9000:9000"
  volumes:
    - ./vector.toml:/etc/vector/vector.toml:ro
    - /var/log:/var/log:ro
    - /var/run/docker.sock:/var/run/docker.sock:ro
  depends_on:
    - clickhouse
  networks:
    - vector-net
  restart: unless-stopped
clickhouse
clickhouse:
  image: clickhouse/clickhouse-server:latest
  ports:
    - "8123:8123"
    - "9001:9000"
  volumes:
    - clickhouse_data:/var/lib/clickhouse
    - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/config.xml:ro
  environment:
    CLICKHOUSE_DB: ${CLICKHOUSE_DB}
    CLICKHOUSE_USER: ${CLICKHOUSE_USER}
    CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
  ulimits:
    nofile:
      soft: 262144
      hard: 262144
  networks:
    - vector-net
  restart: unless-stopped
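The clickhouse service bind-mounts ./clickhouse-config.xml, which the recipe does not provide. If the file is missing, Docker typically creates an empty directory at that path instead, which is unlikely to be what you want. A minimal placeholder could look like the sketch below; the logger override is only an example setting, not something this recipe requires.

```shell
# Write a minimal ClickHouse server override (example setting only).
cat > clickhouse-config.xml << 'EOF'
<clickhouse>
    <!-- Reduce server log verbosity; purely illustrative -->
    <logger>
        <level>warning</level>
    </logger>
</clickhouse>
EOF
```

Files under config.d are merged into the main server configuration, so only the settings being changed need to appear here.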
grafana
grafana:
  image: grafana/grafana:latest
  ports:
    - "3000:3000"
  environment:
    GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
    GF_INSTALL_PLUGINS: grafana-clickhouse-datasource
  volumes:
    - grafana_data:/var/lib/grafana
  depends_on:
    - clickhouse
  networks:
    - vector-net
  restart: unless-stopped
nginx
nginx:
  image: nginx:alpine
  ports:
    - "80:80"
  volumes:
    - ./nginx.conf:/etc/nginx/nginx.conf:ro
  depends_on:
    - grafana
  networks:
    - vector-net
  restart: unless-stopped
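The nginx service expects ./nginx.conf, which is likewise not included. As a rough sketch, assuming NGINX should simply forward everything on port 80 to Grafana, a configuration along these lines could serve as a starting point. The routing choice is an assumption; the recipe does not say which services NGINX should front.

```shell
# Write an illustrative nginx.conf that proxies port 80 to Grafana.
cat > nginx.conf << 'EOF'
events {}

http {
    # Map the Upgrade header so WebSocket connections (Grafana Live) pass through
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      close;
    }

    server {
        listen 80;

        location / {
            # "grafana" resolves through the shared vector-net network
            proxy_pass http://grafana:3000;
            proxy_set_header Host $host;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
        }
    }
}
EOF
```

Because the file replaces /etc/nginx/nginx.conf entirely, it has to carry its own `events` and `http` blocks and cannot be a bare `server` block.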
Quick Start
terminal
# 1. Create the compose file
cat > docker-compose.yml << 'EOF'
services:
  vector:
    image: timberio/vector:latest-alpine
    ports:
      - "8686:8686"
      - "9000:9000"
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/log:/var/log:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    ports:
      - "8123:8123"
      - "9001:9000"
    volumes:
      - clickhouse_data:/var/lib/clickhouse
      - ./clickhouse-config.xml:/etc/clickhouse-server/config.d/config.xml:ro
    environment:
      CLICKHOUSE_DB: ${CLICKHOUSE_DB}
      CLICKHOUSE_USER: ${CLICKHOUSE_USER}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    networks:
      - vector-net
    restart: unless-stopped

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_INSTALL_PLUGINS: grafana-clickhouse-datasource
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - clickhouse
    networks:
      - vector-net
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - grafana
    networks:
      - vector-net
    restart: unless-stopped

volumes:
  clickhouse_data:
  grafana_data:

networks:
  vector-net:
    driver: bridge
EOF

# 2. Create the .env file
cat > .env << 'EOF'
# ClickHouse
CLICKHOUSE_DB=logs
CLICKHOUSE_USER=vector
CLICKHOUSE_PASSWORD=secure_clickhouse_password

# Grafana
GRAFANA_PASSWORD=secure_grafana_password
EOF

# 3. Start the services
docker compose up -d

# 4. View logs
docker compose logs -f
One-Liner
Run this command to download and set up the recipe in one step:
terminal
curl -fsSL https://docker.recipes/api/recipes/vector-log-pipeline/run | bash
Troubleshooting
- ClickHouse 'Memory limit exceeded' errors: Increase max_memory_usage setting or add more RAM to handle large analytical queries
- Vector pipeline lag or dropped logs: Check backpressure configuration and ClickHouse insert performance, tune batch sizes in vector.toml
- Grafana ClickHouse datasource connection failed: Verify ClickHouse HTTP interface is accessible on port 8123 and credentials are correct
- Vector sources not receiving logs: Check file permissions for /var/log access and Docker socket permissions for container logs
- ClickHouse out of disk space: Add a TTL clause to the log table so old rows expire automatically, and consider a stronger column codec such as ZSTD (MergeTree already compresses with LZ4 by default)
- High Vector CPU usage during parsing: Optimize VRL transform functions and consider using native ClickHouse parsing functions instead
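When working through the items above, it can help to first establish which service is unreachable. The helper below is a small sketch that probes each service's health endpoint from the Docker host. The function name `check_stack` is made up for this example, and it assumes the default port mappings from docker-compose.yml and that Vector's API is enabled in vector.toml.

```shell
# Probe the health endpoints of Vector, ClickHouse, and Grafana.
check_stack() {
    for url in \
        http://localhost:8686/health \
        http://localhost:8123/ping \
        http://localhost:3000/api/health
    do
        # -f makes curl fail on HTTP errors; the response body is discarded
        if curl -fsS -o /dev/null "$url"; then
            echo "ok   $url"
        else
            echo "FAIL $url"
        fi
    done
}
```

Running `check_stack` after `docker compose up -d` prints one line per service. A FAIL line indicates which container's logs to inspect first with `docker compose logs`.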
Components
vector, clickhouse, grafana, nginx
Tags
#vector #logging #clickhouse #pipeline #observability
Category
Monitoring & Observability