StarRocks Cluster Architecture (3 FE + 3 BE in One Docker Compose)
Overview
This architecture deploys a highly available StarRocks cluster using a single Docker Compose file. The cluster consists of:
- 3 Frontend (FE) Nodes
- 3 Backend (BE) Nodes
- 1 Docker Compose file that defines all six services
The Frontend nodes provide metadata management, query parsing, optimization, and cluster coordination, while Backend nodes handle distributed storage and query execution.
This setup is suitable for:
- High Availability (HA)
- Real-time Analytics
- Data Warehousing
- Business Intelligence (BI)
- Large-scale OLAP workloads
Cluster Components
Frontend Layer (3 FE Nodes)
The Frontend layer manages the cluster.
FE-1 (Leader)
Responsibilities:
- Metadata Management
- Query Parsing
- Query Optimization
- Cluster Coordination
- User Authentication
FE-2 (Follower)
Responsibilities:
- Metadata Replication
- Automatic Failover Support
- Leader Election Participation
FE-3 (Follower)
Responsibilities:
- Metadata Replication
- High Availability Support
- Cluster Monitoring
Benefits of multiple FE nodes:
✅ No single point of failure
✅ Automatic leader election
✅ Metadata redundancy
✅ Higher cluster availability
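To see which FE currently holds the leader role, one option is to run `SHOW FRONTENDS` against any FE and pick the row whose role is `LEADER`. The sketch below assumes the `mysql` client is installed, that `fe1` resolves from where the script runs, and that the output has columns named `IP` and `Role` with a `LEADER` value. Column names and role values may differ between StarRocks versions, and the function names are illustrative.

```shell
# Print the IP of the FE whose Role column is LEADER.
# Reads tab-separated `SHOW FRONTENDS` output (header row first) on stdin.
# Assumption: the columns are named "IP" and "Role"; adjust for your version.
leader_ip() {
  awk -F '\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "Role") r = i
        if ($i == "IP") ip = i
      }
      next
    }
    r && ip && $r == "LEADER" { print $ip }
  '
}

# Ask one FE (fe1 by default) for the frontend list.
# -B requests tab-separated batch output with a header row.
show_frontends() {
  mysql -h "${1:-fe1}" -P 9030 -u root -B -e 'SHOW FRONTENDS;'
}
```

A possible invocation is `show_frontends fe2 | leader_ip`, which asks a follower who the leader is.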
Backend Layer (3 BE Nodes)
The Backend layer performs storage and computation.
BE-1
Responsibilities:
- Data Storage
- Query Execution
- Aggregation Processing
BE-2
Responsibilities:
- Data Replication
- Distributed Computation
- Storage Services
BE-3
Responsibilities:
- Parallel Query Processing
- Data Replication
- Result Generation
Benefits:
✅ Horizontal scalability
✅ Parallel execution
✅ Fault tolerance
✅ High-performance analytics
Data Flow
           Client Applications
                    │
                    ▼
           ┌─────────────────┐
           │ FE Leader Node  │
           └─────────────────┘
                    │
                    ▼
      Query Planning & Optimization
                    │
                    ▼
      ┌────────┬────────┬────────┐
      │  BE-1  │  BE-2  │  BE-3  │
      └────────┴────────┴────────┘
                    │
                    ▼
         Distributed Processing
                    │
                    ▼
              Query Results
                    │
                    ▼
                 Client
Docker Compose Deployment
A single Docker Compose file manages:
    services:
      fe1:
      fe2:
      fe3:
      be1:
      be2:
      be3:
Benefits:
- Simplified deployment
- Centralized configuration
- Easy maintenance
- Consistent environments
- Rapid cluster startup
Start the cluster:
docker compose up -d
Check containers:
docker ps
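Beyond scanning `docker ps` by eye, a small helper can report which of the six services is not running. This is a minimal sketch, assuming Docker Compose v2 (for `docker compose ps --services --status running`) and the service names listed above; the function names are illustrative.

```shell
# Service names taken from the compose file outline in this document.
EXPECTED_SERVICES="fe1 fe2 fe3 be1 be2 be3"

# Print every expected service that is absent from the list on stdin
# (one running service name per line).
missing_services() {
  running="$(cat)"
  for svc in $EXPECTED_SERVICES; do
    if ! printf '%s\n' "$running" | grep -qxF "$svc"; then
      printf '%s\n' "$svc"
    fi
  done
}

# Ask Docker Compose which services are running and report the gaps.
# Run this from the directory that contains the compose file.
check_cluster() {
  docker compose ps --services --status running | missing_services
}
```

Calling `check_cluster` prints nothing when all six services are up, and prints the name of each missing service otherwise.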
Query Execution Workflow
Step 1
A client connects using MySQL protocol:
mysql -h fe1 -P 9030 -u root
Step 2
The FE node the client connected to receives the SQL query (fe1, the leader in this layout).
Step 3
The query is parsed and optimized.
Step 4
The execution plan is distributed to:
- BE-1
- BE-2
- BE-3
Step 5
Backend nodes process data in parallel.
Step 6
Results are aggregated and returned to the client.
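The planning and distribution steps above can be observed with `EXPLAIN`, which prints the plan as a series of `PLAN FRAGMENT` sections that the FE hands to the BE nodes. The sketch below assumes the `mysql` client is available and that `fe1` is reachable; the table `demo.events` in the usage note is hypothetical, and the function names are illustrative.

```shell
# Count the plan fragments in EXPLAIN output read from stdin.
count_fragments() {
  grep -c 'PLAN FRAGMENT'
}

# Run EXPLAIN for a statement ($1) through one FE ($2, fe1 by default).
explain_query() {
  mysql -h "${2:-fe1}" -P 9030 -u root -B -e "EXPLAIN $1"
}
```

For example, `explain_query 'SELECT COUNT(*) FROM demo.events' | count_fragments` would show how many fragments the optimizer produced for that query. More than one fragment usually indicates that work is exchanged between nodes.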
High Availability Features
Frontend High Availability
          FE-1 (Leader)
                │
          ┌─────┴─────┐
          │           │
        FE-2        FE-3
     (Follower)  (Follower)
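If FE-1 fails:
- The two remaining FE nodes still form a majority (2 of 3), so FE-2 or FE-3 is elected as the new leader.
- Metadata remains available because each follower holds a replicated copy.
- Queries continue with minimal interruption, although clients connected to FE-1 need to reconnect through another FE.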
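The failover behavior rests on majority voting among the FE nodes: a leader can be elected only while more than half of them are reachable. The arithmetic can be sketched as follows; the function names are illustrative and not part of any StarRocks tooling.

```shell
# Votes needed for a majority among n FE nodes (leader included).
fe_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# How many of n FE nodes can fail while a majority remains.
fe_tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}
```

With three FE nodes, `fe_quorum 3` gives 2 and `fe_tolerated_failures 3` gives 1, which is why this layout survives the loss of one FE but not two. A fourth FE would raise the quorum to 3 without increasing the number of tolerated failures.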
Backend Fault Tolerance
Data Replica A
├── BE-1
├── BE-2
└── BE-3
If one BE node fails:
- Replicas on the remaining nodes continue serving queries, provided each table keeps more than one replica (StarRocks defaults to three).
- Data availability is maintained.
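To confirm how many BE nodes the cluster currently considers healthy, one approach is to count the rows of `SHOW BACKENDS` whose `Alive` column is `true`. This is a sketch under the assumption that the `mysql` client is installed, `fe1` is reachable, and the output contains a column named `Alive`; the function names are illustrative.

```shell
# Count backends whose Alive column is "true".
# Reads tab-separated `SHOW BACKENDS` output (header row first) on stdin.
alive_backends() {
  awk -F '\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "Alive") a = i
      }
      next
    }
    a && $a == "true" { n++ }
    END { print n + 0 }
  '
}

# Ask one FE (fe1 by default) for the backend list.
show_backends() {
  mysql -h "${1:-fe1}" -P 9030 -u root -B -e 'SHOW BACKENDS;'
}
```

Running `show_backends | alive_backends` should print 3 on a healthy cluster and 2 after one BE container is stopped.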
Typical Ports
| Component | Port | Purpose |
|---|---|---|
| FE | 9030 | MySQL Protocol |
| FE | 8030 | Web UI |
| FE | 9010 | Edit log replication between FE nodes |
| FE | 9020 | RPC (Thrift) |
| BE | 8040 | HTTP Service |
| BE | 9060 | Thrift service (receives requests from FE) |
Key Advantages
✅ High Availability (3 FE)
✅ Distributed Storage (3 BE)
✅ MySQL Compatible
✅ Massive Parallel Processing (MPP)
✅ Real-time Analytics
✅ Horizontal Scalability
✅ Fault Tolerance
✅ Easy Deployment with Docker Compose
