Understanding Docker Images
Docker images are the foundation of containerization. This guide explains what images are, how they're built from layers, how to tag and version them, and how to work with registries like Docker Hub.
A Docker image is a lightweight, standalone, executable package that includes everything needed to run a piece of software: code, runtime, system tools, libraries, and settings. Images are read-only templates used to create containers. Think of an image as a class definition and a container as an instance of that class.
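To make the image/container distinction concrete, here is a minimal sketch: one pulled image backs any number of independent containers (the container names web1 and web2 are arbitrary).
# One image, many containers: each container adds its own writable layer
docker pull nginx:alpine
docker run -d --name web1 nginx:alpine
docker run -d --name web2 nginx:alpine
# Both containers run from the same read-only image layers
docker ps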
Images are built from a series of layers. Each layer represents an instruction in the Dockerfile, such as installing a package, copying files, or setting environment variables. Layers are cached and reused across images, which saves disk space and speeds up builds. When you pull an image, you download these layers. When you build an image, you add new layers on top of existing ones.
Docker images use a Union File System to stack layers. Each layer is read-only and represents a change to the filesystem. When you run a container from an image, Docker adds a thin read-write layer on top of the stack. This is where the container writes data. This layered architecture has several benefits: layers are cached, so rebuilding is fast; layers are shared between images, saving disk space; and layers can be pushed and pulled independently.
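You can see the read-write layer in action with a quick experiment (a sketch; the container name scratchpad is arbitrary):
# Writes land in the container's thin read-write layer, not the image
docker run --name scratchpad alpine sh -c 'echo hello > /data.txt'
# Removing the container discards its writable layer; the image is untouched
docker rm scratchpad
# A fresh container sees the original image: the file was never part of it
docker run --rm alpine cat /data.txt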
Consider a typical Node.js application Dockerfile. The base layer is the Node.js image. Then a layer for package.json, then a layer for running npm install, then a layer for copying source code, and finally a layer for setting the start command. If you change only your source code, Docker reuses the cached layers for Node.js, package.json, and npm install—only rebuilding the COPY and CMD layers.
Container (Read-Write Layer)
+------------------------------+
| Layer 5: CMD                 | ← Command to run
| Layer 4: COPY src            | ← Your application code
| Layer 3: RUN npm install     | ← Dependencies (cached)
| Layer 2: COPY package.json   | ← Package definition
| Layer 1: FROM node:18        | ← Base image
+------------------------------+
Image (Read-Only Layers)
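You can inspect this stack for any local image with docker history, which lists each layer newest-first along with the instruction that created it (myapp:latest below stands in for any image you have built or pulled):
# Show an image's layers and the size each instruction added
docker history myapp:latest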
Tags are how Docker versions images. A tag is a human-readable label attached to an image ID. The format is [registry/][namespace/]repository:tag. For example, docker.io/library/nginx:latest (the fully qualified name behind plain nginx:latest) breaks down as registry docker.io, namespace library (the default namespace for official images), repository nginx, and tag latest.
The default tag is latest, but using it in production is risky because it can point to a different image each time you pull. Instead, use specific version tags like v1.2.3, or git commit SHAs for immutability. Semantic versioning is recommended: myapp:1.0.0, myapp:1.0.1, and so on. You can also apply multiple tags to the same image; for example, when you build version 1.0.0, you might tag it as both myapp:1.0.0 and myapp:latest.
# Tag an image
docker tag myapp:latest myregistry/myapp:v1.0.0
# List local images and their tags
docker images
# Pull by specific tag
docker pull nginx:alpine
# Push with version tag
docker push myregistry/myapp:v1.0.0
# Apply multiple tags to same image
docker tag myapp:latest myapp:v1.0.0
docker tag myapp:latest myapp:stable
A registry is a storage and content delivery system for Docker images. It hosts repositories of images. The default public registry is Docker Hub at docker.io. You can also run private registries (like AWS ECR, Azure Container Registry, Google Container Registry) or host your own with Docker Registry.
Docker Hub hosts millions of public images, including official images such as nginx, postgres, node, python, and ubuntu. Official images are curated, regularly updated, and well documented. When you run docker pull nginx, Docker pulls from Docker Hub by default.
For private images, you have several options: Docker Hub's private repositories (the free tier includes one private repo), cloud provider registries (ECR, ACR, GCR), or a self-hosted registry. Private registries require authentication before you can pull or push.
# Login to Docker Hub
docker login
# Login to AWS ECR
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com
# Pull from a private registry
docker pull myregistry.azurecr.io/myapp:latest
# Push to a private registry
docker push myregistry.azurecr.io/myapp:v1.0.0
# Search for images on Docker Hub
docker search nginx
Here are the essential commands for managing Docker images day to day:
# List images
docker images
docker image ls
# Pull an image without running
docker pull nginx:alpine
# Build an image from a Dockerfile
docker build -t myapp:latest .
# Tag an existing image
docker tag myapp:latest myregistry/myapp:v1.0.0
# Push an image to a registry
docker push myregistry/myapp:v1.0.0
# Remove an image
docker rmi myapp:latest
# Remove unused images (dangling)
docker image prune
# Remove all unused images
docker image prune -a
# Inspect image details
docker inspect myapp:latest
# Show image history (layers)
docker history nginx:alpine
# Save image to tar file
docker save -o myapp.tar myapp:latest
# Load image from tar file
docker load -i myapp.tar
When using Docker Hub, you'll encounter two types of images: official and community. Official images are maintained by Docker in collaboration with software vendors. They follow best practices, are regularly updated, and are considered trustworthy. Examples include nginx, postgres, node, python, ubuntu, and redis.
Community images are published by users. They can be found under usernames like bitnami/nginx or linuxserver/plex. Community images vary in quality—some are excellent, others are outdated or insecure. Always check the Dockerfile, read the documentation, and look at the number of pulls and stars before using a community image in production.
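As a first filter, docker search can restrict results to official images or to images above a star threshold:
# Only official images
docker search --filter is-official=true nginx
# Only images with at least 100 stars
docker search --filter stars=100 nginx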
Smaller images are faster to pull, faster to deploy, and have a smaller attack surface. Here are key strategies for reducing image size:
Use Alpine-based images. Alpine Linux is a security-oriented, lightweight Linux distribution. An Alpine image is about 5MB, compared to Ubuntu's ~70MB. Most official images have an Alpine variant: node:alpine, python:alpine, nginx:alpine.
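You can verify the difference locally; exact sizes vary by release, and the tags below are examples:
# Pull both bases, then compare the SIZE column
docker pull alpine:3.19
docker pull ubuntu:22.04
docker images alpine:3.19
docker images ubuntu:22.04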
Use multi-stage builds. Separate build-time dependencies from runtime dependencies. The build stage can have compilers, SDKs, and test frameworks; the final stage copies only the compiled artifacts. This eliminates build tools from the final image.
Combine RUN commands. Each RUN command creates a new layer. Combine commands with && and clean up temporary files in the same layer to avoid persisting them.
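For example, on a Debian-based image, installing packages and deleting the apt cache must happen in the same RUN instruction, or the cache stays baked into an earlier layer (a sketch; curl stands in for any package):
# One layer: install and clean up together so the apt cache never persists
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*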
# Bad: Multiple layers, large image
FROM node:18
COPY package.json .
COPY package-lock.json .
RUN npm install
COPY . .
CMD ["npm", "start"]
# Good: Alpine base, multi-stage build
# Build stage: full dependencies plus a build step (assumes a "build" script
# that outputs dist/; adjust to your project)
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: production dependencies only, plus the built artifacts
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
- Use specific tags, not just latest. latest is ambiguous and can change unexpectedly. Tag images with semantic versions or git commit SHAs for reproducibility.
- Keep images minimal. Remove unnecessary packages, use Alpine variants, and clean up temporary files in the same layer.
- Don't add secrets to images. Never bake API keys, passwords, or certificates into images. Use environment variables, Docker secrets, or external secret stores.
- Scan images for vulnerabilities. Use Docker Scout (the successor to docker scan) or tools like Trivy to check for known vulnerabilities.
- Use .dockerignore. Exclude unnecessary files (node_modules, .git, .env) from the build context to speed up builds and reduce image size.
- Run as a non-root user. Create and switch to a non-root user in your Dockerfile for better security (see the sketch after this list).
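A minimal sketch of the non-root pattern on an Alpine base (the user and group names are arbitrary, and server.js is a placeholder entry point):
# Create an unprivileged user and drop root before the app starts
FROM node:18-alpine
RUN addgroup -S app && adduser -S app -G app
WORKDIR /app
COPY --chown=app:app . .
USER app
CMD ["node", "server.js"]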
A few questions come up again and again when working with images:
How do I see what each layer contains? Use docker history <image> to see each layer with its command and size. Add --no-trunc to see full command lines. This helps explain why images are large.
What happens to the old image when I rebuild with docker build? The old image remains. This immutability is a feature: it ensures consistent, reproducible deployments.
What are dangling images? Untagged images left behind when a tag moves to a newer build. Use docker image prune to remove them.
How do I move an image between machines without a registry? Use docker save -o myimage.tar myimage:tag to export the image to a tar file. Transfer the file, then use docker load -i myimage.tar to import it on another machine.
Why is my image so large? Common causes are leaving package manager caches in place (clean them up in the same RUN layer, e.g. rm -rf /var/lib/apt/lists/*), copying large files that aren't needed at runtime, or not using multi-stage builds.
Mastering Docker images is essential for efficient container development. Understand layers, tag wisely, and keep your images small and secure.