Kubernetes Architecture

A comprehensive deep dive into Kubernetes architecture covering control plane components, worker nodes, etcd, API server, scheduler, controller manager, and cloud provider integrations. Understand how Kubernetes orchestrates containers at scale.

What is Kubernetes Architecture?

Kubernetes follows a master-worker architecture where the cluster is divided into two main parts: the Control Plane (master) and Worker Nodes. The Control Plane manages the cluster's state and orchestrates workloads, while Worker Nodes run the actual containerized applications.

All components communicate via the API Server, which serves as the central hub for all cluster operations. This architecture ensures high availability, scalability, and resilience, making Kubernetes suitable for production-grade container orchestration.

High-Level Architecture
Control Plane ↔ API Server ↔ Worker Nodes
Key Principle: Kubernetes uses a declarative model. You declare the desired state (e.g., "run 3 replicas of nginx"), and Kubernetes continuously reconciles the actual state to match your declaration. This is the foundation of self-healing and automated operations.
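The declarative model described above can be illustrated with a toy reconciliation loop. This is only a minimal sketch, not how any real controller is implemented: `ToyCluster`, `start_replica`, `stop_replica`, and `reconcile` are hypothetical names invented for the example, and the "cluster" is just an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for the observed cluster state."""
    running: list = field(default_factory=list)  # names of running replicas
    _counter: int = 0

    def start_replica(self, app: str) -> None:
        self._counter += 1
        self.running.append(f"{app}-{self._counter}")

    def stop_replica(self) -> None:
        self.running.pop()


def reconcile(cluster: ToyCluster, app: str, desired: int) -> int:
    """Run one reconciliation pass; return the number of corrective actions."""
    actions = 0
    while len(cluster.running) < desired:   # too few replicas: start more
        cluster.start_replica(app)
        actions += 1
    while len(cluster.running) > desired:   # too many replicas: stop extras
        cluster.stop_replica()
        actions += 1
    return actions


cluster = ToyCluster()
reconcile(cluster, "nginx", desired=3)            # converge from 0 to 3 replicas
cluster.running.remove("nginx-2")                 # simulate a crashed replica
healed = reconcile(cluster, "nginx", desired=3)   # self-healing pass
print(cluster.running, healed)
```

The caller never says "start one more replica"; it only restates the desired count, and the loop works out the difference. That is the property that makes self-healing fall out of the model.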
Control Plane Components

The Control Plane is the brain of the Kubernetes cluster. It makes global decisions about the cluster (scheduling, scaling, etc.) and detects and responds to cluster events. In production environments, the Control Plane is typically run on multiple nodes for high availability.

API Server (kube-apiserver)

Role: The central hub for all cluster operations
Exposes the Kubernetes API, validates and processes requests, and is the only component that communicates with etcd. All internal and external communication goes through the API Server.

etcd

Role: Distributed key-value store for cluster state
Stores all cluster data, including configuration, secrets, and resource state. etcd is strongly consistent and, when run as a multi-member cluster, highly available. Every change to cluster state is persisted in etcd.

Scheduler (kube-scheduler)

Role: Assigns pods to worker nodes
Watches for newly created pods with no assigned node and selects the best node based on resource requirements, affinity, taints/tolerations, and other constraints.

Controller Manager (kube-controller-manager)

Role: Runs controller processes
Includes controllers like Node Controller, Replication Controller, Endpoints Controller, and Service Account Controller. Each controller watches the cluster state and makes changes to move towards the desired state.

Cloud Controller Manager

Role: Cloud provider integration
Bridges Kubernetes with cloud provider APIs for load balancers, storage volumes, and node management. Only required when running on a cloud provider like AWS, GCP, or Azure.
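# Check control plane component status (deprecated since Kubernetes v1.19; output may be unreliable on newer clusters)
kubectl get componentstatuses

# View control plane pods in kube-system namespace
kubectl get pods -n kube-system

# Show the API server and core service endpoints
kubectl cluster-info

# Check API server health via its readiness endpoint
kubectl get --raw='/readyz?verbose'

# View leader election status
kubectl get leases -n kube-system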
Worker Node Components

Worker nodes are the machines where containerized applications run. Each worker node contains the necessary services to run pods, report status, and handle communication with the Control Plane.

Kubelet

Role: Primary node agent
Ensures containers are running in a pod. Communicates with the API Server to receive pod specifications and reports node status. Responsible for container lifecycle management.

Container Runtime

Role: Runs containers
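The container runtime (for example containerd or CRI-O) pulls images and creates and runs containers. It must implement the CRI (Container Runtime Interface). Docker Engine can be used only through the cri-dockerd adapter, because the built-in dockershim was removed in Kubernetes v1.24.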

CNI Plugins

Role: Container networking
Implements the Container Network Interface (CNI) specification, providing network connectivity to pods. Examples: Calico, Flannel, Weave, Cilium.

kube-proxy

Role: Network proxy and load balancer
Maintains network rules on nodes. Implements the Service abstraction by forwarding traffic addressed to a Service to one of its backing pods. Supports iptables and IPVS modes; the legacy userspace mode has been removed in recent Kubernetes releases.
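As a rough mental model of the forwarding described above, a Service address maps to a set of ready pod endpoints, and each new connection is sent to one of them. The sketch below is a simplification under stated assumptions: the `SERVICES` table, the addresses, and `pick_backend` are hypothetical, and real kube-proxy programs kernel rules rather than choosing backends in user code.

```python
import random

# Hypothetical service table: virtual IP:port -> ready pod endpoints.
SERVICES = {
    "10.96.0.10:80": ["10.244.1.5:8080", "10.244.2.7:8080", "10.244.3.9:8080"],
}


def pick_backend(service_addr: str, rng: random.Random) -> str:
    """Choose a pod endpoint for a new connection to a service address."""
    endpoints = SERVICES.get(service_addr)
    if not endpoints:
        raise LookupError(f"no ready endpoints for {service_addr}")
    return rng.choice(endpoints)   # random selection among ready endpoints


rng = random.Random(0)  # seeded only to make the sketch repeatable
chosen = [pick_backend("10.96.0.10:80", rng) for _ in range(6)]
print(chosen)
```

Clients only ever see the stable service address; pods can come and go behind it as long as the endpoint list is kept up to date.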
# Check node status
kubectl get nodes

# View node details
kubectl describe node <node-name>

# Check kubelet logs (run on the node itself)
journalctl -u kubelet -f

# View node capacity and allocatable resources
kubectl describe node <node-name> | grep -A 5 "Capacity"
Control Plane Communication Flow

Understanding how components communicate is essential for debugging and designing resilient clusters.

Request Flow

1. kubectl → API Server - User sends command via kubectl or API client

2. API Server → etcd - Authentication, authorization, admission control, and validation, followed by persistence to etcd

3. Controller Manager ← API Server - Watches for changes and reconciles state

4. Scheduler ← API Server - Watches for unscheduled pods

5. Scheduler → API Server - Binds the pod to the selected node (sets spec.nodeName on the pod)

6. Kubelet ← API Server - Receives pod assignments and reports status

7. Kubelet → Container Runtime - Pulls images and starts containers
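The seven steps above can be traced in a small in-memory simulation. This is a sketch under heavy simplification: `ToyApiServer`, `scheduler_pass`, and `kubelet_pass` are hypothetical names, the round-robin placement is a placeholder rather than real scheduling, and real components use watches instead of polling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pod:
    name: str
    node: Optional[str] = None   # set by the scheduler (step 5)
    phase: str = "Pending"       # updated by the kubelet (steps 6-7)


class ToyApiServer:
    """Hypothetical stand-in for the API Server plus its etcd-backed store."""

    def __init__(self) -> None:
        self.pods: Dict[str, Pod] = {}

    def create_pod(self, name: str) -> None:          # steps 1-2
        self.pods[name] = Pod(name)

    def unscheduled(self) -> List[Pod]:               # step 4
        return [p for p in self.pods.values() if p.node is None]

    def assigned_to(self, node: str) -> List[Pod]:    # step 6
        return [p for p in self.pods.values() if p.node == node]


def scheduler_pass(api: ToyApiServer, nodes: List[str]) -> None:
    for i, pod in enumerate(api.unscheduled()):
        pod.node = nodes[i % len(nodes)]   # placeholder policy, not real scoring


def kubelet_pass(api: ToyApiServer, node: str) -> None:
    for pod in api.assigned_to(node):
        if pod.phase == "Pending":
            pod.phase = "Running"   # stands in for pulling images and starting containers


api = ToyApiServer()
api.create_pod("web-1")
scheduler_pass(api, ["node-a", "node-b"])
kubelet_pass(api, "node-a")
print(api.pods["web-1"])
```

Note that the scheduler and the kubelet never talk to each other directly; both only read and write state held by the API server, which mirrors the hub-and-spoke flow in the list.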

Security Note: All control plane-to-node communication is secured via TLS certificates. The API Server is the only component that directly communicates with etcd, providing a security boundary.
etcd: The Cluster State Store

etcd is a distributed, reliable key-value store that serves as Kubernetes' backing store for all cluster data. It's a critical component that requires special attention in production.

Consistency

etcd uses the Raft consensus algorithm to maintain consistency across nodes. All changes go through the leader, ensuring a single source of truth.

Performance

etcd is optimized for low latency and high throughput. Uses bbolt for storage and supports watch for real-time change notifications.
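The watch mechanism mentioned above can be pictured as a store that bumps a revision counter on every write and notifies subscribers whose key prefix matches. The following is a minimal sketch with a hypothetical `ToyStore` class; it is not the etcd client API, and the key names are illustrative only.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[int, str, str]  # (revision, key, value)


class ToyStore:
    """Hypothetical in-memory store; real etcd persists to disk and replicates via Raft."""

    def __init__(self) -> None:
        self.revision = 0
        self.data: Dict[str, str] = {}
        self.watchers: List[Tuple[str, Callable[[Event], None]]] = []

    def watch(self, prefix: str, callback: Callable[[Event], None]) -> None:
        """Register a callback for every future write under the given key prefix."""
        self.watchers.append((prefix, callback))

    def put(self, key: str, value: str) -> int:
        """Write a key, bump the store revision, and notify matching watchers."""
        self.revision += 1
        self.data[key] = value
        for prefix, callback in self.watchers:
            if key.startswith(prefix):
                callback((self.revision, key, value))
        return self.revision


store = ToyStore()
seen: List[Event] = []
store.watch("/registry/pods/", seen.append)
store.put("/registry/pods/default/web-1", "Pending")
store.put("/registry/services/default/web", "ClusterIP")   # not under the watched prefix
store.put("/registry/pods/default/web-1", "Running")
print(seen)
```

Because watchers are pushed each change together with its revision, components can react to updates without repeatedly polling the whole data set.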

High Availability

For production, run etcd as a cluster of 3 or 5 members. Writes require a quorum (a majority of members), so a 3-member cluster tolerates one failure and a 5-member cluster tolerates two.
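The arithmetic behind the 3-or-5 recommendation is short enough to write out. The helper names `quorum` and `fault_tolerance` below are made up for this illustration.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Going from 3 to 4 members raises the quorum without raising the number of tolerated failures, which is why even-sized clusters are generally avoided.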

Storage

etcd stores data in the default directory /var/lib/etcd/. Monitor disk I/O as etcd is I/O sensitive. Use SSD storage for production.
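# Check etcd health
etcdctl endpoint health --cluster

# Check etcd member list
etcdctl member list

# Back up etcd
ETCDCTL_API=3 etcdctl snapshot save snapshot.db

# Restore etcd into a new data directory (etcd v3.5+ deprecates this in favor of etcdutl snapshot restore)
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db --data-dir /var/lib/etcd-backup

# Check etcd metrics (port 2381 assumes a kubeadm-style --listen-metrics-urls setting; otherwise use the client URL, port 2379 by default)
curl http://localhost:2381/metrics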
Critical: etcd is the single source of truth for your cluster. Always back up etcd regularly and monitor its health. Losing etcd data means losing your entire cluster state, including all configurations and secrets.
API Server: The Cluster Gateway

The kube-apiserver is the central management component of Kubernetes. It exposes the Kubernetes API and serves as the front-end for the control plane.

Authentication & Authorization

Supports multiple auth methods: client certificates, bearer tokens, OIDC, webhook. Authorization uses RBAC, ABAC, or webhook to determine permissions.
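For the RBAC case, the authorization decision boils down to "is there any rule that grants this verb on this resource?", with denial as the default. The sketch below assumes a flattened, hypothetical `BINDINGS` table and an `is_allowed` helper; real RBAC also involves namespaces, API groups, and separate Role and RoleBinding objects.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Rule:
    verbs: FrozenSet[str]
    resources: FrozenSet[str]


# Hypothetical bindings: user name -> rules granted through roles.
BINDINGS: Dict[str, List[Rule]] = {
    "alice": [Rule(frozenset({"get", "list"}), frozenset({"pods"}))],
    "admin": [Rule(frozenset({"*"}), frozenset({"*"}))],
}


def is_allowed(user: str, verb: str, resource: str) -> bool:
    """Allow only if some rule grants the verb on the resource; otherwise deny."""
    for rule in BINDINGS.get(user, []):
        verb_ok = "*" in rule.verbs or verb in rule.verbs
        resource_ok = "*" in rule.resources or resource in rule.resources
        if verb_ok and resource_ok:
            return True
    return False


print(is_allowed("alice", "list", "pods"))     # True
print(is_allowed("alice", "delete", "pods"))   # False
print(is_allowed("admin", "delete", "nodes"))  # True
```

There are no "deny" rules in this model: permissions are purely additive, so removing access means removing the rule that granted it.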

Admission Control

Admission controllers intercept requests after authorization. They can mutate (MutatingAdmissionWebhook) or validate (ValidatingAdmissionWebhook) resources before they're persisted.
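The mutate-then-validate ordering can be sketched as a small pipeline. Everything here is hypothetical: the dictionary-based pod, the `add_default_labels` and `reject_privileged` policies, and the `admit` function are illustrations, not the webhook API.

```python
import copy
from typing import Any, Callable, Dict, List

PodSpec = Dict[str, Any]


def add_default_labels(pod: PodSpec) -> PodSpec:
    """Mutating step (hypothetical policy): ensure a 'team' label exists."""
    pod = copy.deepcopy(pod)   # do not modify the caller's object
    pod.setdefault("labels", {}).setdefault("team", "unassigned")
    return pod


def reject_privileged(pod: PodSpec) -> None:
    """Validating step (hypothetical policy): refuse privileged pods."""
    if pod.get("privileged"):
        raise ValueError(f"pod {pod['name']!r} requests privileged mode")


def admit(pod: PodSpec,
          mutators: List[Callable[[PodSpec], PodSpec]],
          validators: List[Callable[[PodSpec], None]]) -> PodSpec:
    """Apply all mutators first, then validate the final object before it is persisted."""
    for mutate in mutators:
        pod = mutate(pod)
    for validate in validators:
        validate(pod)
    return pod


admitted = admit({"name": "web"}, [add_default_labels], [reject_privileged])
print(admitted)
```

Running validation after mutation matters: validators see the object exactly as it would be stored, including any defaults that mutators injected.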

API Aggregation

Supports extending the API with CustomResourceDefinitions (CRDs) and with the aggregation layer, which proxies requests to extension API servers. This allows building platform-specific extensions without modifying core Kubernetes.

Rate Limiting

Implements rate limiting to protect the cluster from abuse. Uses API priority and fairness to ensure critical requests are served even under load.
# Check API server flags
kubectl get pod kube-apiserver-<id> -n kube-system -o yaml

# Check API server logs
kubectl logs -n kube-system kube-apiserver-<id>

# Enable audit logging: set --audit-policy-file and --audit-log-path in the API server configuration

# Check API server metrics
kubectl get --raw /metrics
Scheduler: The Pod Placement Engine

The kube-scheduler is responsible for assigning pods to nodes based on resource requirements, policies, and constraints. It's a critical component for workload distribution.

Filtering Phase

Filters nodes that can run the pod based on resource requests, node selectors, taints/tolerations, and affinity/anti-affinity rules.

Scoring Phase

Scores filtered nodes based on priority functions (resource balance, affinity, and custom scoring). The node with the highest score is selected.
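The filter-then-score process can be shown with a toy scheduler. This is a sketch under simplifying assumptions: the `Node` and `PodRequest` types, the single "most headroom" scoring function, and the string-based taints are all hypothetical, and the real scheduler combines many weighted plugins.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    name: str
    free_cpu_m: int             # free CPU in millicores
    free_mem_mi: int            # free memory in MiB
    taints: Set[str] = field(default_factory=set)


@dataclass
class PodRequest:
    cpu_m: int
    mem_mi: int
    tolerations: Set[str] = field(default_factory=set)


def feasible(node: Node, pod: PodRequest) -> bool:
    """Filtering: the pod must fit and must tolerate every taint on the node."""
    fits = node.free_cpu_m >= pod.cpu_m and node.free_mem_mi >= pod.mem_mi
    tolerated = node.taints <= pod.tolerations
    return fits and tolerated


def score(node: Node, pod: PodRequest) -> float:
    """Toy scoring: prefer the node with the largest share of resources left after placement."""
    cpu_left = (node.free_cpu_m - pod.cpu_m) / max(node.free_cpu_m, 1)
    mem_left = (node.free_mem_mi - pod.mem_mi) / max(node.free_mem_mi, 1)
    return (cpu_left + mem_left) / 2


def schedule(nodes: List[Node], pod: PodRequest) -> Optional[str]:
    """Return the name of the best feasible node, or None if the pod is unschedulable."""
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, pod)).name


nodes = [
    Node("node-a", free_cpu_m=500, free_mem_mi=1024),
    Node("node-b", free_cpu_m=4000, free_mem_mi=8192, taints={"gpu"}),
    Node("node-c", free_cpu_m=2000, free_mem_mi=4096),
]
print(schedule(nodes, PodRequest(cpu_m=1000, mem_mi=512)))
print(schedule(nodes, PodRequest(cpu_m=1000, mem_mi=512, tolerations={"gpu"})))
```

In the first call only node-c survives filtering (node-a is too small, node-b is tainted); in the second, the toleration makes node-b eligible and its larger headroom wins the scoring phase.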

Extensibility

Scheduler can be extended with custom plugins for filtering and scoring. This enables custom scheduling policies for specific workloads.

Scheduling Policies

Supports preemption (evict lower-priority pods) and pod priority. Can be configured with custom scheduler profiles.
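Preemption can be sketched as picking lower-priority victims until the incoming pod fits. The example below considers a single node and CPU only; `RunningPod` and `pick_victims` are hypothetical names, and the real scheduler weighs additional factors when choosing victims.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunningPod:
    name: str
    priority: int
    cpu_m: int   # CPU request in millicores


def pick_victims(running: List[RunningPod], free_cpu_m: int,
                 need_cpu_m: int, incoming_priority: int) -> Optional[List[str]]:
    """Return names of lower-priority pods to evict so the incoming pod fits, or None."""
    if free_cpu_m >= need_cpu_m:
        return []   # fits already, nothing to evict
    victims: List[str] = []
    # Consider only strictly lower-priority pods, lowest priority first.
    lower = sorted((p for p in running if p.priority < incoming_priority),
                   key=lambda p: p.priority)
    for pod in lower:
        victims.append(pod.name)
        free_cpu_m += pod.cpu_m
        if free_cpu_m >= need_cpu_m:
            return victims
    return None   # even evicting every candidate would not free enough CPU


running = [
    RunningPod("batch-1", priority=0, cpu_m=500),
    RunningPod("batch-2", priority=10, cpu_m=500),
    RunningPod("db", priority=1000, cpu_m=1000),
]
print(pick_victims(running, free_cpu_m=200, need_cpu_m=1000, incoming_priority=100))
```

Here the two batch pods are selected and the higher-priority `db` pod is never considered, since only pods with a strictly lower priority than the incoming pod are eligible for eviction.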
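# Check scheduler logs
kubectl logs -n kube-system kube-scheduler-<id>

# View scheduling events
kubectl get events --field-selector involvedObject.name=<pod-name>

# Check scheduler configuration (assumes it is stored in a ConfigMap named scheduler-config; many clusters instead pass a file via --config in the scheduler's static pod manifest)
kubectl get configmap scheduler-config -n kube-system -o yaml

# Force pod rescheduling (the pod is recreated only if a controller such as a Deployment manages it)
kubectl delete pod <pod-name>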
Cloud Provider Integration

Kubernetes integrates with cloud providers (AWS, GCP, Azure, etc.) for infrastructure services like load balancers, persistent volumes, and node management.

Cloud Controller Manager

The cloud-controller-manager runs cloud-specific controllers. It allows Kubernetes to interact with cloud provider APIs for provisioning resources.

Storage Provisioning

Cloud providers offer CSI (Container Storage Interface) drivers for dynamic provisioning of persistent volumes. Examples: EBS, GCE PD, Azure Disk.

Load Balancers

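Service type LoadBalancer provisions a cloud load balancer (for example an AWS ELB/NLB, a GCP Network Load Balancer, or an Azure Load Balancer) to expose services externally. Layer 7 (HTTP) load balancers are typically provisioned through Ingress instead.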

Node Management

Cloud providers handle node lifecycle: adding nodes to the cluster, detecting node failures, and managing node labels and taints.
# Check cloud provider configuration (the ConfigMap name varies by provider and installer; cloud-config is an example)
kubectl get configmap cloud-config -n kube-system -o yaml

# View service with load balancer
kubectl get svc

# Check storage classes
kubectl get storageclass

# View cloud provider metrics
kubectl get --raw /metrics | grep cloud
High Availability Architecture

Production Kubernetes clusters require high availability (HA) to ensure cluster resilience during failures. HA clusters run multiple control plane nodes and distribute workloads across worker nodes.

Multi-Master Control Plane

Run 3 or 5 control plane nodes for HA. Use a load balancer in front of the API Servers. etcd runs on separate nodes or co-located.

etcd Cluster

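etcd runs as a cluster (3 or 5 members) for consensus. Place etcd members on separate machines and in separate failure domains for fault isolation, while keeping network latency between them low, since every write must be acknowledged by a majority.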

Network Redundancy

Use redundant network paths, load balancers, and multi-AZ deployments to prevent single points of failure.

Disaster Recovery

Implement etcd backup and restore procedures. Document cluster recovery procedures and test them regularly.
# Check etcd cluster health
etcdctl endpoint health --cluster

# Check control plane status (deprecated since Kubernetes v1.19)
kubectl get componentstatuses

# View leader election
kubectl get leases -n kube-system

# Test failover by draining a control plane node
kubectl drain <control-plane-node> --ignore-daemonsets
Frequently Asked Questions
What is the difference between Control Plane and Worker Nodes?
Control Plane nodes manage the cluster state and orchestrate workloads. They run components like API Server, etcd, Scheduler, and Controller Manager. Worker nodes run the actual application workloads and communicate with the Control Plane via the API Server.
Can I run Kubernetes without etcd?
No. etcd is the core state store for Kubernetes. All cluster state is stored in etcd, making it mandatory for any Kubernetes cluster.
What happens if the API Server goes down?
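If the API Server is unavailable, users can't interact with the cluster, and controllers and the scheduler can't make changes. Existing workloads continue running on worker nodes, but nothing new is scheduled and cluster-level self-healing (such as replacing pods from a failed node) is paused until the API Server returns. This is why HA with multiple API Server replicas is critical.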
How many control plane nodes do I need for HA?
For high availability, you need at least 3 control plane nodes, and an odd number is preferred. With etcd co-located on those nodes, 3 nodes keep quorum when one node fails and 5 nodes keep quorum when two fail. 5 nodes is recommended for larger clusters.
What's the role of the Cloud Controller Manager?
The Cloud Controller Manager integrates Kubernetes with cloud provider APIs. It manages cloud resources like load balancers, persistent volumes, and node lifecycles, allowing Kubernetes to leverage cloud infrastructure.
How does the scheduler decide where to place pods?
The scheduler uses a two-phase process: filtering (finds nodes that can run the pod) and scoring (ranks filtered nodes). It considers resource requests, node affinity/anti-affinity, taints/tolerations, and custom scheduling policies.
What is etcd's role in the Kubernetes cluster?
etcd is the distributed key-value store that holds all cluster state. It stores configuration, secrets, and the status of all resources. The API Server reads from and writes to etcd, making it the source of truth.
Can I use a different container runtime with Kubernetes?
Yes, Kubernetes supports any container runtime that implements the CRI (Container Runtime Interface). Common options include containerd (the default in many distributions), CRI-O, and Docker (via cri-dockerd).

Understanding Kubernetes architecture is the foundation for building and operating production-grade clusters. Master these concepts to design resilient, scalable, and secure container orchestration platforms.