Kubernetes Architecture

A comprehensive deep dive into Kubernetes architecture covering control plane components, worker nodes, etcd, API server, scheduler, controller manager, and cloud provider integrations. Understand how Kubernetes orchestrates containers at scale.

What is Kubernetes Architecture?

Kubernetes follows a master-worker architecture where the cluster is divided into two main parts: the Control Plane (master) and Worker Nodes. The Control Plane manages the cluster's state and orchestrates workloads, while Worker Nodes run the actual containerized applications.

All components communicate through the API Server, the central hub for cluster operations. Because every other component is a client of that one API, control plane components can be replicated and replaced independently, which supports the high availability, scalability, and resilience expected of production-grade container orchestration.

High-Level Architecture

Control Plane (API Server, etcd, Scheduler, Controller Manager) ↔ Worker Nodes (kubelet, kube-proxy, container runtime)

Key Principle: Kubernetes uses a declarative model. You declare the desired state (e.g., "run 3 replicas of nginx"), and Kubernetes continuously reconciles the actual state to match your declaration. This is the foundation of self-healing and automated operations.
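To make the declarative model concrete, here is a minimal sketch of a reconciliation pass in Python. It is a toy, not how any real controller is implemented: the `Cluster` class, the `reconcile` function, and the pod naming scheme are all hypothetical, and a real controller would observe state through API Server watches rather than a local dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for observed cluster state (hypothetical, not a Kubernetes API)."""
    pods: dict[str, list[str]] = field(default_factory=dict)  # app -> running pod names
    created: int = 0  # ever-increasing suffix so new pod names never collide


def reconcile(cluster: Cluster, app: str, desired_replicas: int) -> list[str]:
    """Compare desired and actual replica counts and return the actions taken."""
    running = cluster.pods.setdefault(app, [])
    actions = []
    # Too few pods: create replacements until the count matches.
    while len(running) < desired_replicas:
        name = f"{app}-{cluster.created}"
        cluster.created += 1
        running.append(name)
        actions.append(f"create {name}")
    # Too many pods: remove the surplus, newest first.
    while len(running) > desired_replicas:
        actions.append(f"delete {running.pop()}")
    return actions


cluster = Cluster()
print(reconcile(cluster, "nginx", 3))    # first pass creates three pods
cluster.pods["nginx"].remove("nginx-1")  # simulate a pod crash
print(reconcile(cluster, "nginx", 3))    # next pass creates one replacement
print(reconcile(cluster, "nginx", 3))    # already converged: no actions
```

The point of the sketch is that the caller never says "start a pod"; it only restates the desired count, and the same function handles initial creation, crash recovery, and scale-down.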

Control Plane Components

The Control Plane is the brain of the Kubernetes cluster. It makes global decisions about the cluster (scheduling, scaling, etc.) and detects and responds to cluster events. In production environments, the Control Plane is typically run on multiple nodes for high availability.

API Server (kube-apiserver)

Role: The central hub for all cluster operations

Exposes the Kubernetes API, validates and processes requests, and is the only component that communicates with etcd. All internal and external communication goes through the API Server.

etcd

Role: Distributed key-value store for cluster state

Stores all cluster data including configuration, secrets, and state. Consistent and highly available. All cluster changes are recorded in etcd.

Scheduler (kube-scheduler)

Role: Assigns pods to worker nodes

Watches for newly created pods with no assigned node and selects the best node based on resource requirements, affinity, taints/tolerations, and other constraints.

Controller Manager (kube-controller-manager)

Role: Runs controller processes

Includes controllers like Node Controller, Replication Controller, Endpoints Controller, and Service Account Controller. Each controller watches the cluster state and makes changes to move towards the desired state.

Cloud Controller Manager

Role: Cloud provider integration

Bridges Kubernetes with cloud provider APIs for load balancers, network routes, and node lifecycle management. Only needed when the cluster runs on a cloud provider such as AWS, GCP, or Azure.

# Check control plane component status
# (componentstatuses is deprecated since v1.19 and may report stale data)
kubectl get componentstatuses

# View control plane pods in kube-system namespace
kubectl get pods -n kube-system

# Check API server endpoints and readiness
kubectl cluster-info
kubectl get --raw='/readyz?verbose'

# View leader election status
kubectl get leases -n kube-system

Worker Node Components

Worker nodes are the machines where containerized applications run. Each worker node contains the necessary services to run pods, report status, and handle communication with the Control Plane.

Kubelet

Role: Primary node agent

Ensures containers are running in a pod. Communicates with the API Server to receive pod specifications and reports node status. Responsible for container lifecycle management.

Container Runtime

Role: Runs containers

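The container runtime (for example containerd or CRI-O) pulls images and creates and runs containers. It must implement the CRI (Container Runtime Interface). Docker Engine can still be used through the cri-dockerd adapter, since the built-in dockershim was removed in Kubernetes v1.24.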

CNI Plugins

Role: Container networking

Implements the Container Network Interface (CNI) specification, providing network connectivity to pods. Examples: Calico, Flannel, Weave, Cilium.

kube-proxy

Role: Network proxy and load balancer

Maintains network rules on nodes. Implements the Service abstraction by forwarding traffic to the correct pods. On Linux it supports iptables and IPVS modes, and newer releases add an nftables mode; the legacy userspace mode was removed in v1.26.

# Check node status
kubectl get nodes

# View node details
kubectl describe node <node-name>

# Check kubelet logs (run on the node itself; assumes systemd)
journalctl -u kubelet -f

# View node capacity and allocatable resources
kubectl describe node <node-name> | grep -E -A 7 "^(Capacity|Allocatable):"

Control Plane Communication Flow

Understanding how components communicate is essential for debugging and designing resilient clusters.

Request Flow

1. kubectl → API Server - User sends command via kubectl or API client

2. API Server → etcd - Authenticates, authorizes, and validates the request, then persists the object to etcd

3. Controller Manager ← API Server - Watches for changes and reconciles state

4. Scheduler ← API Server - Watches for unscheduled pods

5. Scheduler → API Server - Binds the pod to the chosen node (sets spec.nodeName)

6. Kubelet ← API Server - Receives pod assignments and reports status

7. Kubelet → Container Runtime - Pulls images and starts containers
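The flow above can be traced with a small simulation. This is a rough sketch under heavy assumptions: `ApiServer`, `make_scheduler`, and `make_kubelet` are hypothetical names, callbacks run synchronously instead of over asynchronous watch streams, and there is only one node, so the "scheduler" does no real placement. It only illustrates that every component reacts to changes it sees through the API Server and writes its result back there.

```python
class ApiServer:
    """Toy API server: stores pods and notifies registered watchers on every write."""

    def __init__(self):
        self.pods = {}      # stands in for etcd
        self.watchers = []  # callbacks registered by other components
        self.log = []       # ordered trace of who wrote what

    def watch(self, callback):
        self.watchers.append(callback)

    def write(self, who, pod):
        self.pods[pod["name"]] = dict(pod)
        self.log.append(f"{who}: {pod['name']} node={pod.get('node')} phase={pod['phase']}")
        for callback in list(self.watchers):
            callback(dict(pod))


def make_scheduler(api, node_name):
    def on_pod(pod):
        if pod.get("node") is None:  # unscheduled pod seen through the watch
            api.write("scheduler", {**pod, "node": node_name})
    return on_pod


def make_kubelet(api, node_name):
    def on_pod(pod):
        if pod.get("node") == node_name and pod["phase"] == "Pending":
            # A real kubelet would ask the container runtime to start containers here.
            api.write(f"kubelet/{node_name}", {**pod, "phase": "Running"})
    return on_pod


api = ApiServer()
api.watch(make_scheduler(api, "node-1"))
api.watch(make_kubelet(api, "node-1"))
api.write("kubectl", {"name": "web", "node": None, "phase": "Pending"})
print("\n".join(api.log))
```

Note that the scheduler and kubelet never call each other. Each one only talks to the API Server, which is why a stalled scheduler leaves pods Pending without affecting pods that are already running.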

Security Note: All control plane-to-node communication is secured via TLS certificates. The API Server is the only component that directly communicates with etcd, providing a security boundary.

etcd: The Cluster State Store

etcd is a distributed, reliable key-value store that serves as Kubernetes' backing store for all cluster data. It's a critical component that requires special attention in production.

Consistency

etcd uses the Raft consensus algorithm to maintain consistency across nodes. All changes go through the leader, ensuring a single source of truth.

Performance

etcd is optimized for low latency and high throughput. Uses bbolt for storage and supports watch for real-time change notifications.

High Availability

For production, run etcd in a cluster of 3 or 5 nodes. This provides fault tolerance while maintaining quorum for writes.
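The preference for 3 or 5 members follows from majority arithmetic, which the short Python sketch below works through. The function names are illustrative only; the formula itself (a majority is more than half of the members) is standard for Raft-style consensus.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in range(1, 8):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Going from 3 to 4 members raises the quorum from 2 to 3 but still tolerates only one failure, so an even-sized cluster adds replication cost without adding fault tolerance.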

Storage

On kubeadm-provisioned clusters, etcd stores its data in /var/lib/etcd/ by default. Monitor disk I/O, because etcd is sensitive to write latency, and use SSD storage for production.

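# On TLS-secured clusters (e.g. kubeadm), also pass --endpoints, --cacert, --cert, and --key

# Check etcd health
etcdctl endpoint health --cluster

# Check etcd member list
etcdctl member list

# Backup etcd
ETCDCTL_API=3 etcdctl snapshot save snapshot.db

# Restore etcd into a fresh data directory (etcd 3.5+ prefers: etcdutl snapshot restore)
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --data-dir /var/lib/etcd-backup

# Check etcd metrics (kubeadm serves them on 127.0.0.1:2381; otherwise use the client URL)
curl http://localhost:2381/metrics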

Critical: etcd is the single source of truth for your cluster. Always back up etcd regularly and monitor its health. Losing etcd data means losing your entire cluster state, including all configurations and secrets.

API Server: The Cluster Gateway

The kube-apiserver is the central management component of Kubernetes. It exposes the Kubernetes API and serves as the front-end for the control plane.

Authentication & Authorization

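Supports multiple authentication methods: client certificates, bearer tokens, OIDC, and webhook token authentication. Authorization is handled by RBAC, ABAC, Node, or webhook authorizers, which decide whether the authenticated identity may perform the request.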

Admission Control

Admission controllers intercept requests after authorization. They can mutate (MutatingAdmissionWebhook) or validate (ValidatingAdmissionWebhook) resources before they're persisted.
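The ordering of these stages is easier to see in code. The following Python sketch models a request passing through authentication, authorization, mutating admission, validating admission, and finally persistence. Everything in it is a hypothetical stand-in: the `TOKENS` and `ALLOWED` tables, the default-label mutation, and the "no latest tag" policy are invented for illustration and do not correspond to built-in Kubernetes behavior.

```python
class Rejected(Exception):
    """Raised when a stage refuses the request."""


# Hypothetical stand-ins for real authenticators and RBAC rules.
TOKENS = {"token-alice": "alice"}
ALLOWED = {("alice", "create", "pods")}


def authenticate(token: str) -> str:
    if token not in TOKENS:
        raise Rejected("401 Unauthorized")
    return TOKENS[token]


def authorize(user: str, verb: str, resource: str) -> None:
    if (user, verb, resource) not in ALLOWED:
        raise Rejected("403 Forbidden")


def mutate(obj: dict) -> dict:
    """Mutating admission: add a default label unless the object already sets it."""
    labels = {"team": "unassigned", **obj.get("labels", {})}
    return {**obj, "labels": labels}


def validate(obj: dict) -> None:
    """Validating admission: reject images without a pinned, non-latest tag."""
    image = obj.get("image", "")
    if ":" not in image or image.endswith(":latest"):
        raise Rejected("admission denied: image must pin a tag other than latest")


def handle(token: str, verb: str, resource: str, obj: dict, store: dict) -> dict:
    user = authenticate(token)
    authorize(user, verb, resource)
    obj = mutate(obj)         # mutating admission runs first
    validate(obj)             # validating admission sees the mutated object
    store[obj["name"]] = obj  # only now is the object persisted
    return obj


store = {}
print(handle("token-alice", "create", "pods", {"name": "web", "image": "nginx:1.27"}, store))
```

Because persistence is the last step, a request rejected at any earlier stage leaves the store untouched.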

API Aggregation

Supports extending the API with custom resources (CRDs) and API aggregation. This allows building platform-specific extensions without modifying core Kubernetes.

Rate Limiting

Implements rate limiting to protect the cluster from abuse. Uses API priority and fairness to ensure critical requests are served even under load.

# Check API server flags (on kubeadm the static pod is named kube-apiserver-<node-name>)
kubectl get pod kube-apiserver-<id> -n kube-system -o yaml

# Check API server logs
kubectl logs -n kube-system kube-apiserver-<id>

# Enable audit logging
# Set --audit-policy-file and --audit-log-path in the API server configuration

# Check API server metrics
kubectl get --raw /metrics

Scheduler: The Pod Placement Engine

The kube-scheduler is responsible for assigning pods to nodes based on resource requirements, policies, and constraints. It's a critical component for workload distribution.

Filtering Phase

Filters nodes that can run the pod based on resource requests, node selectors, taints/tolerations, and affinity/anti-affinity rules.

Scoring Phase

Scores the remaining nodes with scoring plugins (resource balance, affinity preferences, and any custom scorers). The node with the highest total score is selected, and ties are broken at random.
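The two phases can be sketched in a few lines of Python. This is a simplified model, not the kube-scheduler's actual plugin framework: the `Node` and `Pod` classes are hypothetical, only CPU, memory, and a single boolean taint are considered, and the scoring rule (prefer the node left with the most free capacity) is one assumed policy among many. Allocatable values are assumed to be positive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    alloc_cpu_m: int      # allocatable CPU in millicores (assumed > 0)
    alloc_mem_mi: int     # allocatable memory in MiB (assumed > 0)
    used_cpu_m: int = 0   # CPU already requested by scheduled pods
    used_mem_mi: int = 0  # memory already requested by scheduled pods
    tainted: bool = False


@dataclass
class Pod:
    cpu_m: int
    mem_mi: int
    tolerates_taint: bool = False


def feasible(pod: Pod, node: Node) -> bool:
    """Filtering: requests must fit and any taint must be tolerated."""
    fits_cpu = node.used_cpu_m + pod.cpu_m <= node.alloc_cpu_m
    fits_mem = node.used_mem_mi + pod.mem_mi <= node.alloc_mem_mi
    return fits_cpu and fits_mem and (not node.tainted or pod.tolerates_taint)


def score(pod: Pod, node: Node) -> float:
    """Scoring: average fraction of CPU and memory left free after placement."""
    cpu_free = 1 - (node.used_cpu_m + pod.cpu_m) / node.alloc_cpu_m
    mem_free = 1 - (node.used_mem_mi + pod.mem_mi) / node.alloc_mem_mi
    return (cpu_free + mem_free) / 2


def schedule(pod: Pod, nodes: list[Node]) -> Optional[str]:
    candidates = [n for n in nodes if feasible(pod, n)]
    if not candidates:
        return None  # no feasible node: the pod would stay Pending
    return max(candidates, key=lambda n: score(pod, n)).name


nodes = [
    Node("node-a", 4000, 8192, used_cpu_m=3000, used_mem_mi=4096),
    Node("node-b", 4000, 8192, used_cpu_m=1000, used_mem_mi=2048),
    Node("node-c", 8000, 16384, tainted=True),
]
print(schedule(Pod(500, 512), nodes))                        # node-b
print(schedule(Pod(500, 512, tolerates_taint=True), nodes))  # node-c
print(schedule(Pod(5000, 512), nodes))                       # None
```

Filtering is a hard yes/no gate while scoring only ranks the survivors, so a tainted but otherwise empty node is never chosen for a pod without the matching toleration, however well it would score.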

Extensibility

Scheduler can be extended with custom plugins for filtering and scoring. This enables custom scheduling policies for specific workloads.

Scheduling Policies

Supports preemption (evict lower-priority pods) and pod priority. Can be configured with custom scheduler profiles.

# Check scheduler logs
kubectl logs -n kube-system kube-scheduler-<id>

# View scheduling events
kubectl get events --field-selector involvedObject.name=<pod-name>

# Check scheduler configuration (flags, including any --config file path)
kubectl get pod kube-scheduler-<id> -n kube-system -o yaml

# Force pod rescheduling (only pods managed by a controller are recreated)
kubectl delete pod <pod-name>

Cloud Provider Integration

Kubernetes integrates with cloud providers (AWS, GCP, Azure, etc.) for infrastructure services like load balancers, persistent volumes, and node management.

Cloud Controller Manager

The cloud-controller-manager runs cloud-specific controllers. It allows Kubernetes to interact with cloud provider APIs for provisioning resources.

Storage Provisioning

Cloud providers offer CSI (Container Storage Interface) drivers for dynamic provisioning of persistent volumes. Examples: EBS, GCE PD, Azure Disk.

Load Balancers

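Service type LoadBalancer provisions a cloud load balancer (for example an AWS ELB or NLB, a GCP Network Load Balancer, or an Azure Load Balancer) to expose services externally. Layer 7 load balancers are usually provisioned through Ingress rather than through this Service type.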

Node Management

Cloud providers handle node lifecycle: adding nodes to the cluster, detecting node failures, and managing node labels and taints.

# Check cloud provider configuration (ConfigMap name and location vary by provider)
kubectl get configmap cloud-config -n kube-system -o yaml

# View service with load balancer
kubectl get svc

# Check storage classes
kubectl get storageclass

# View cloud-related API server metrics (the cloud-controller-manager serves its own /metrics)
kubectl get --raw /metrics | grep cloud

High Availability Architecture

Production Kubernetes clusters require high availability (HA) to ensure cluster resilience during failures. HA clusters run multiple control plane nodes and distribute workloads across worker nodes.

Multi-Master Control Plane

Run 3 or 5 control plane nodes for HA and place a load balancer in front of the API Servers. etcd either runs co-located on the control plane nodes (stacked topology) or on separate machines (external topology).

etcd Cluster

etcd runs as a cluster (3 or 5 nodes) for consensus. Ensure etcd nodes are on separate machines and networks for fault isolation.

Network Redundancy

Use redundant network paths, load balancers, and multi-AZ deployments to prevent single points of failure.

Disaster Recovery

Implement etcd backup and restore procedures. Document cluster recovery procedures and test them regularly.

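# Check etcd cluster health
etcdctl endpoint health --cluster

# Check control plane status (componentstatuses is deprecated; /readyz is the supported check)
kubectl get --raw='/readyz?verbose'

# View leader election
kubectl get leases -n kube-system

# Drain a control plane node before maintenance
# (static control plane pods are not evicted; shut the node down to exercise real failover)
kubectl drain <control-plane-node> --ignore-daemonsets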

Frequently Asked Questions

What is the difference between Control Plane and Worker Nodes?

Control Plane nodes manage the cluster state and orchestrate workloads. They run components like API Server, etcd, Scheduler, and Controller Manager. Worker nodes run the actual application workloads and communicate with the Control Plane via the API Server.

Can I run Kubernetes without etcd?

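Not with upstream Kubernetes. etcd is the core state store, and the API Server persists all cluster state to it. Some distributions, such as k3s, substitute other datastores behind an etcd-compatible shim, but the API Server still speaks the etcd API.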

What happens if the API Server goes down?

If the API Server is unavailable, users can't interact with the cluster, and controllers can't make changes. However, existing workloads continue running on worker nodes. This is why HA with multiple API Server replicas is critical.

How many control plane nodes do I need for HA?

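For high availability, you need at least 3 control plane nodes, and the count should be odd. etcd needs a majority (quorum) of members to accept writes, so 3 members tolerate 1 failure and 5 tolerate 2, while an even count adds no extra tolerance. 5 nodes is recommended for larger clusters.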

What's the role of the Cloud Controller Manager?

The Cloud Controller Manager integrates Kubernetes with cloud provider APIs. It manages cloud resources like load balancers, network routes, and node lifecycles, allowing Kubernetes to leverage cloud infrastructure. Persistent volumes are provisioned separately, through the provider's CSI driver.

How does the scheduler decide where to place pods?

The scheduler uses a two-phase process: filtering (finds nodes that can run the pod) and scoring (ranks filtered nodes). It considers resource requests, node affinity/anti-affinity, taints/tolerations, and custom scheduling policies.

What is etcd's role in the Kubernetes cluster?

etcd is the distributed key-value store that holds all cluster state. It stores configuration, secrets, and the status of all resources. The API Server reads from and writes to etcd, making it the source of truth.

Can I use a different container runtime with Kubernetes?

Yes, Kubernetes supports any container runtime that implements the CRI (Container Runtime Interface). Common options include containerd (the default in many distributions), CRI-O, and Docker Engine (via cri-dockerd).

Understanding Kubernetes architecture is the foundation for building and operating production-grade clusters. Master these concepts to design resilient, scalable, and secure container orchestration platforms.