Dockerfile Best Practices: multi-stage builds, layer caching, image size optimization

Why Dockerfile Best Practices Matter

A well-written Dockerfile produces images that are smaller, faster to build, more secure, and easier to maintain. Poorly written Dockerfiles can result in images that are gigabytes in size, have security vulnerabilities, and take minutes to build. The practices in this guide will help you create production-ready images that follow industry standards.

In favorable cases, optimizing a Dockerfile can reduce image size by 80-95% and cut build times by 50-75%, and it shrinks your container's attack surface. These improvements have a direct impact on deployment speed and infrastructure costs.

Rule 1: Order Layers from Least to Most Frequently Changing

Docker caches each layer. If a layer hasn't changed, Docker reuses the cached version. To maximize cache hits, order your Dockerfile instructions from least frequently changed to most frequently changed. Put base images, environment variables, and dependency installation FIRST. Put your application code LAST.

This means: FROM → WORKDIR → ENV → COPY package.json → RUN npm install → COPY source code → CMD. This way, when you change only your source code, Docker reuses the cached layers for dependencies, saving significant build time.

# Bad: Reinstalls dependencies on every code change
FROM node:18-alpine
COPY . /app
WORKDIR /app
RUN npm install
CMD ["npm", "start"]

# Good: Dependencies cached unless package.json changes
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag
RUN npm ci --omit=dev
COPY . .
CMD ["npm", "start"]

Rule 2: Use Multi-Stage Builds to Eliminate Build Dependencies

Multi-stage builds allow you to use multiple FROM statements in a single Dockerfile. Each stage can have its own base image and instructions. You can copy artifacts from earlier stages into the final stage, leaving behind build tools, compilers, and intermediate files that aren't needed at runtime.

This is especially powerful for compiled languages (Go, Rust, Java, C++) and for frontend applications that need build tools like webpack. The final image contains only the runtime dependencies and compiled artifacts, dramatically reducing image size.

# Multi-stage build for Go application
# Stage 1: Build (includes compilers)
FROM golang:1.21 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o myapp .

# Stage 2: Runtime (no compilers, much smaller)
# Pinned tag instead of alpine:latest, for reproducible builds
FROM alpine:3.19
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/myapp .
EXPOSE 8080
CMD ["./myapp"]

Multi-stage builds can reduce image size from over 1GB to under 20MB for Go applications, and from 800MB to under 200MB for Node.js applications.
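
The same pattern applies to the frontend case mentioned above. The following is a minimal sketch, not a definitive setup: it assumes, hypothetically, that the project's build script writes static files to /app/dist and that nginx is an acceptable way to serve them. Adjust the script name and output path to your project.

```dockerfile
# Stage 1: build static assets (Node.js and build tooling live only here)
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Assumption: the "build" script emits static files into /app/dist
RUN npm run build

# Stage 2: serve the built files; no Node.js or node_modules in the final image
FROM nginx:1.25-alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80
```

The final stage relies on the default command of the nginx image, so no CMD is needed here.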

Rule 3: Minimize Layer Count and Clean Up in the Same Layer

Each RUN, COPY, and ADD instruction creates a new layer. While layers are not inherently bad, unnecessary layers increase image size and build complexity. Combine related commands into a single RUN using && and clean up temporary files in the same layer where they're created.

For apt-get, always combine update and install in the same RUN, and remove the package cache afterward. This prevents outdated cache from being frozen into the image layer.

# Bad: Multiple layers, cache persists
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN rm -rf /var/lib/apt/lists/*

# Good: Single layer with cleanup
RUN apt-get update && \
    apt-get install -y curl git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# For npm: combine install and cache cleanup
# (--omit=dev replaces the deprecated --only=production flag)
RUN npm ci --omit=dev && \
    npm cache clean --force

Rule 4: Choose Minimal Base Images

Base images differ dramatically in size. Approximate figures: Ubuntu ~70MB, Debian-slim ~40MB, Alpine ~5MB, Distroless ~2MB (exact sizes vary by tag and architecture). The smaller the base image, the smaller your final image and the smaller the attack surface. However, smaller images ship fewer debugging tools, and distroless images do not even include a shell.

For production, prefer Alpine, Distroless, or slim variants. For development or debugging, use the full images. Always pin to specific version tags, never latest. This ensures reproducibility and prevents unexpected changes.

Base Image          Size                   Use Case
alpine:latest       ~5MB                   Production, minimal attack surface
node:18-alpine      ~50MB                  Node.js production
python:3.11-slim    ~120MB                 Python production
golang:alpine       ~200MB (build stage)   Go builds
distroless/static   ~2MB                   Static binaries (no shell)

# Recommended: Specific Alpine variant
FROM node:18.17.0-alpine3.18

# Also recommended: Slim variants
FROM python:3.11-slim-bookworm

# Distroless (no shell - highest security)
FROM gcr.io/distroless/static-debian11

Rule 5: Security - Run as Non-Root User

By default, containers run as root. This is a security risk: if an attacker compromises your container, they have root access inside it. Create and switch to a non-root user before the CMD instruction. Use a specific UID (like 1001) rather than relying only on a username, for better compatibility with volume mounts.

Also avoid storing secrets in images. Never hardcode passwords, API keys, or tokens. Use Docker secrets, environment variables, or external secret stores instead. Scan your images for vulnerabilities using Docker Scout or Trivy.

# Create non-root user
FROM node:18-alpine

# Add non-root user (alpine syntax); -G puts the user in the nodejs group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs

WORKDIR /app
COPY --chown=nodejs:nodejs . .

# Switch to non-root user
USER nodejs

CMD ["node", "server.js"]

Never run containers as root in production. Creating a non-root user takes two lines and dramatically improves security.
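
The example above uses Alpine's addgroup and adduser. On Debian-based images the commands differ, and USER can take a numeric UID:GID directly, which matches the advice to prefer a specific UID. The following is a sketch under assumptions: the user name app and the entry point main.py are hypothetical placeholders.

```dockerfile
FROM python:3.11-slim-bookworm

# Debian syntax: create a group and user with fixed numeric IDs
RUN groupadd --gid 1001 app && \
    useradd --uid 1001 --gid 1001 --no-create-home --shell /usr/sbin/nologin app

WORKDIR /app
COPY --chown=1001:1001 . .

# Numeric UID:GID, so file ownership on mounted volumes is predictable
USER 1001:1001

# Assumption: main.py is the application entry point
CMD ["python", "main.py"]
```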

Rule 6: Use .dockerignore to Exclude Unnecessary Files

The .dockerignore file works like .gitignore. It prevents unnecessary files from being sent to the Docker daemon during docker build. This speeds up builds and keeps your image smaller by excluding files like node_modules, .git, logs, temporary files, and secrets.

# .dockerignore file example
node_modules/
npm-debug.log
.git/
.gitignore
.env
*.md
Dockerfile
.dockerignore
.idea/
.vscode/
coverage/
.nyc_output/
# Exclude dist/ only if you rebuild it inside Docker
dist/

Rule 7: Add HEALTHCHECK for Production Containers

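HEALTHCHECK tells Docker how to test whether a container is still working properly. Without it, Docker only knows whether the main process has exited. With HEALTHCHECK, Docker reports a health status for the container, and Docker Swarm replaces tasks whose containers become unhealthy. Kubernetes ignores the Dockerfile HEALTHCHECK and relies on its own liveness and readiness probes, so define those separately if you deploy there.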

# HEALTHCHECK examples
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost/ || exit 1

# For Node.js applications
HEALTHCHECK --interval=30s --timeout=3s \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"

# For databases
HEALTHCHECK --interval=30s --timeout=5s --retries=5 \
  CMD pg_isready -U postgres || exit 1

Complete Best Practices Example: Node.js Production Dockerfile

# Stage 1: Build
FROM node:18.17.0-alpine3.18 AS builder

WORKDIR /build

# Copy dependency files first (for caching)
COPY package*.json ./
RUN npm ci

# Copy source and build
COPY . .
RUN npm run build
# Drop devDependencies (--omit=dev replaces the deprecated --production flag)
RUN npm prune --omit=dev

# Stage 2: Production
FROM node:18.17.0-alpine3.18

# Create non-root user; -G puts the user in the nodejs group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs

WORKDIR /app

# Copy built artifacts
COPY --from=builder --chown=nodejs:nodejs /build/package*.json ./
COPY --from=builder --chown=nodejs:nodejs /build/node_modules ./node_modules
COPY --from=builder --chown=nodejs:nodejs /build/dist ./dist

# Environment
ENV NODE_ENV=production
ENV PORT=3000

# Security
USER nodejs
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"

# Start
CMD ["node", "dist/server.js"]

This image is: Small (Alpine base, multi-stage build), Fast (layer caching optimized), Secure (non-root user), Observable (health check), Reproducible (pinned versions).

Common Anti-Patterns to Avoid

Using :latest tag - Breaks reproducibility. Always pin to specific versions.
Running as root - Security risk. Use USER instruction to switch to non-root.
Copying entire directory before installing dependencies - Breaks layer caching. Copy dependency manifests first.
Multiple RUN statements that could be combined - Creates unnecessary layers and increases image size.
Leaving temporary files and package caches - Clean up in the same layer.
No HEALTHCHECK - Orchestration systems can't detect hanging containers.
Storing secrets in images - Use Docker secrets or environment variables.
Not using .dockerignore - Sends unnecessary files to daemon, slowing builds.

Frequently Asked Questions

How much smaller can multi-stage builds make my images?

For compiled languages like Go, Rust, or C++, multi-stage builds can reduce image size from 1GB+ to under 20MB. For Node.js applications, you can go from 800MB to under 200MB. For Python, from 900MB to under 100MB. The savings are substantial.

Should I always use Alpine base images?

Alpine is excellent for production due to its small size and small attack surface. However, Alpine uses musl libc instead of glibc, so some npm packages with native dependencies have no prebuilt binaries and must be compiled on Alpine, which can fail or be complicated. For those cases, use Debian-slim variants. Test with Alpine first; if it works, use it.
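
If you want to stay on Alpine despite native dependencies, one option is to install the compiler toolchain in a separate stage so it never reaches the final image. This is a rough sketch, assuming that python3, make, and g++ are enough for your native modules to compile and that server.js is the entry point; some packages need additional libraries.

```dockerfile
# Stage 1: install dependencies with a compiler toolchain available
FROM node:18-alpine AS deps
# Assumption: these packages cover what node-gyp needs for your modules
RUN apk add --no-cache python3 make g++
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Stage 2: runtime image without the toolchain
FROM node:18-alpine
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
# Requires node_modules/ in .dockerignore so this does not overwrite the copy above
COPY . .
# The official node images ship with a non-root "node" user
USER node
CMD ["node", "server.js"]
```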

How do I view Docker build cache usage?

Use docker build --progress=plain to see detailed output showing which layers are cached. Lines with "CACHED" indicate cache hits. This helps you debug cache misses.

Why does my Docker build run apt-get update even when my packages haven't changed?

Because a previous instruction (like COPY) invalidated the cache. Move COPY commands that change frequently AFTER the RUN apt-get update instruction. The order in your Dockerfile matters.
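
As an illustration of that ordering, here is a minimal sketch (the base image and package list are placeholders): the apt-get layer sits above the COPY, so editing source files only invalidates the layers from COPY downward.

```dockerfile
FROM debian:bookworm-slim

# Rarely changes: this layer stays cached across source edits
RUN apt-get update && \
    apt-get install -y curl git && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Changes often: only the layers from here down are rebuilt
COPY . .
```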

What UID should I use for non-root users?

Use UIDs above 1000 (e.g., 1001). Avoid 0 (root) and the 1-999 range, which Linux distributions typically reserve for system accounts; some official images assign 999 to their service user. Consistency across containers helps with shared volume permissions.

How often should I rebuild my base images?

Rebuild weekly to pick up security patches. Use automated builds with Dependabot to monitor base image updates. For critical security patches, rebuild immediately.

What's the difference between COPY and ADD for best practices?

Best practice says use COPY unless you specifically need ADD's features (URL download or tar extraction). ADD has surprising behaviors (like auto-extracting archives) that can lead to unexpected results. COPY is simpler and more predictable.
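
The difference is easiest to see side by side. In this sketch, vendor.tar.gz is a hypothetical archive in the build context and the destination paths are arbitrary.

```dockerfile
FROM alpine:3.19

# COPY keeps the archive as a single file: /opt/archive/vendor.tar.gz
COPY vendor.tar.gz /opt/archive/

# ADD unpacks a local tar archive: its contents land under /opt/unpacked/
ADD vendor.tar.gz /opt/unpacked/
```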

How do I prevent secrets from leaking into images?

Use Docker BuildKit secrets: RUN --mount=type=secret,id=mysecret. Never use COPY or ADD for secrets. Use environment variables at runtime (not build time). Use external secret stores like HashiCorp Vault or cloud secret managers.
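
A minimal sketch of a BuildKit secret mount follows. It assumes a private npm registry whose credentials live in an .npmrc file; the secret id npmrc and the entry point server.js are illustrative names, not requirements.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./

# The secret file is mounted only for this RUN step and is not written to a layer
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --omit=dev

COPY . .
CMD ["node", "server.js"]
```

The secret is supplied at build time, for example with docker build --secret id=npmrc,src=$HOME/.npmrc . so the file itself never has to be part of the build context.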


Following these best practices transforms your Dockerfiles from functional to production-ready. Smaller images deploy faster, are more secure, and save infrastructure costs.