Container Security: Running as Non-Root

The Default: Root

Most containers run as root by default.

Everyone knows it's bad. Nobody does anything about it.

When a compromise happens, and it will, the attacker gets root inside your container. From there they can:

Reach more of the shared host kernel's attack surface
Attempt container escape
Move to other containers on the node
Potentially compromise the entire cluster

But most teams wave it off: "It's a container, so it's isolated anyway."

That's not how it works.

Container Isolation: Strong But Not Magic

Container isolation is strong. The Linux kernel isolates processes well.

But it's not magic. A process running as root has more power to exploit kernel bugs and escape.

Security is layers. Each layer blocks a share of attacks. Together, they make a full compromise much harder.

Running as root removes one critical layer.

The Fix #1: Run as Non-Root User

Most container images run as root. You need to change that.

When you build your Docker image, add a user:

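FROM node:24
...
# Create an unprivileged user with a home directory
RUN useradd -m appuser
WORKDIR /app
# Copy the application in, owned by the unprivileged user
COPY --chown=appuser:appuser app /app
# Everything from here on, including CMD, runs as appuser
USER appuser
CMD ["node", "server.js"]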

Now your container runs as appuser, not root.

Your application still works. But now:

Attacker gets non-root access
Can't directly modify system files
Can't install packages
Can't load kernel modules
Can't escape container as easily

Go further: Google Distroless

Distroless images contain only your application and its runtime: no shell, no package manager, no utilities. That leaves an attacker almost nothing to live off.

# Build stage: full image with tools
FROM node:24 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Runtime stage: distroless, no shell
FROM gcr.io/distroless/nodejs24-debian13
WORKDIR /app
COPY --from=build /app /app
USER nonroot
CMD ["server.js"]

The final image has no bash, no sh, no apt. An attacker who gets in finds far fewer tools, though the Node.js runtime itself is still present.
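
Kubernetes can enforce the same rule at runtime. One catch: when an image sets USER by name, the kubelet cannot verify that the name maps to a non-zero UID, so runAsNonRoot: true rejects the container unless a numeric runAsUser is also set. A minimal sketch, assuming the distroless image above (whose nonroot user is UID 65532); the pod name and image reference are placeholders, not values from this article:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                       # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true            # kubelet refuses to start a container as UID 0
    runAsUser: 65532              # distroless "nonroot"; for the appuser image, use the UID from `id -u appuser`
    runAsGroup: 65532
  containers:
    - name: web
      image: registry.example.com/web:1.0.0   # placeholder image reference
```

Setting the numeric UID in the manifest also protects you if someone later rebuilds the image without the USER line.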

The Fix #2: Read-Only Filesystem

By default, the container filesystem is writable.

An attacker can:

Modify your application code
Plant backdoors
Create startup scripts that persist for the life of the container, or longer on a mounted volume
Write to temp directories and execute from there

Make the filesystem read-only:

In your Kubernetes deployment config, add:

securityContext:
  readOnlyRootFilesystem: true

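Now the attacker can't write to the root filesystem. They can't modify your code or plant backdoors alongside it.

They can still run in-memory attacks and write to any writable volume you mount, but nothing they change survives in the container's own filesystem.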
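
Many applications still expect a scratch directory such as /tmp. A common approach is to mount an emptyDir volume there and keep everything else read-only. A sketch, assuming the app only writes under /tmp; the names, image reference, and size limit are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                       # hypothetical name
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0.0   # placeholder image reference
      securityContext:
        readOnlyRootFilesystem: true          # image layers become read-only
      volumeMounts:
        - name: tmp
          mountPath: /tmp                     # the only writable path
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory                        # tmpfs; contents vanish with the pod
        sizeLimit: 64Mi                       # assumed limit, tune per app
```

The mounted directory is still writable, so keep it small and mount as few of these as the application actually needs.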

The Fix #3: Drop Unnecessary Linux Capabilities

Container runtimes grant every container a default set of Linux capabilities, and manifests sometimes add more. Most workloads need none of them.

If you're running a web server, why does it need:

CAP_NET_ADMIN (network administration)
CAP_SYS_ADMIN (system administration)
CAP_SYS_PTRACE (process tracing)
CAP_CHOWN (change file ownership)

Drop all capabilities you don't need. Keep only what's required.

Most web servers need no capabilities at all: listen on a port ≥ 1024 and put a reverse proxy in front.

Fewer capabilities = fewer ways for an attacker to exploit the system.
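
In Kubernetes, dropping capabilities is a few lines in the container's securityContext. A minimal sketch, assuming a web server that listens on an unprivileged port; names and image are placeholders. Note that Kubernetes writes capability names without the CAP_ prefix:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                       # hypothetical name
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0.0   # placeholder image reference
      securityContext:
        allowPrivilegeEscalation: false       # blocks setuid-style privilege gains
        capabilities:
          drop:
            - ALL                             # start from zero
          # add:
          #   - NET_BIND_SERVICE              # only if the app must bind a port < 1024
```

Starting from drop: ALL and adding back individual capabilities tends to be easier to audit than listing the ones to remove.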

The Fix #4: Enforce Pod Security Standards

Don't leave security to individual developers. Enforce standards across the cluster.

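Pod Security Standards, enforced by the built-in Pod Security Admission controller (the older PodSecurityPolicy API was removed in Kubernetes 1.25), can require:

All pods run as non-root
All pods drop unnecessary capabilities
No privileged containers
No privilege escalation

A read-only root filesystem is not part of the Pod Security Standards. Enforcing it cluster-wide takes an additional admission policy, for example a ValidatingAdmissionPolicy or a policy engine such as Kyverno or OPA Gatekeeper.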

Set these policies and new deployments must comply. No exceptions.
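
With Pod Security Admission, enforcement is switched on per namespace through labels. A sketch, assuming a namespace called production (a hypothetical name) and the Restricted profile:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production                                   # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted   # reject pods that violate the profile
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted      # also warn the user at apply time
    pod-security.kubernetes.io/audit: restricted     # and record violations in the audit log
```

On an existing namespace it can be less disruptive to start with only the warn and audit labels, fix what they report, and then add enforce.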

Why Teams Skip This

"It's complicated"

It's not. It's configuration.

Running as root is the default. Running as non-root takes 30 seconds to add to your Dockerfile.

"Our app needs root"

It doesn't. If it does, you have a build problem.

Why does your app need to:

Listen on port < 1024? (Use port >= 1024, or use a reverse proxy)
Write to /etc? (Use a volume or /tmp)
Install packages at runtime? (Include them in the image)
Change file ownership? (Don't)

Most "needs root" is actually bad application design.
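
The privileged-port case is usually the easiest to fix in Kubernetes, because a Service can expose port 80 while the container listens on an unprivileged port. A sketch, assuming the app listens on 8080 and its pods carry the label app: web (both are assumptions for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                 # hypothetical name
spec:
  selector:
    app: web                # assumed pod label
  ports:
    - name: http
      port: 80              # port that clients connect to
      targetPort: 8080      # unprivileged port the container listens on
```

Clients keep using port 80, and the container never needs root or NET_BIND_SERVICE.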

"We'll do it later"

You won't. Do it now. It takes 5 minutes.

The Real Risk

Most container compromises turn serious because of three things:

Running as root - Exploit now has root privileges
Unnecessary capabilities - Exploit can do more (change ownership, load modules, etc.)
Writable filesystem - Exploit can persist (plant backdoors, modify code)

All three are preventable with basic configuration.

But teams don't do it. Then they get compromised and say "containers are insecure."

No. Configuration is insecure.

Implementation Checklist

Dockerfile: Create non-root user, switch to it with USER
Dockerfile: Don't run as root
Kubernetes: Add securityContext with runAsNonRoot: true (plus a numeric runAsUser if the image's USER is a name rather than a UID)
Kubernetes: Add readOnlyRootFilesystem: true
Kubernetes: Drop all capabilities, then add back only the ones the app requires
Cluster: Implement Pod Security Standards
Test: Verify application still works
Document: Why you chose these settings
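
Put together, the Kubernetes items on this checklist fit in one manifest. A sketch rather than a drop-in file: the name, labels, image reference, port, and UID 10001 are all assumptions, and the UID must match the user your image actually defines:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                   # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      securityContext:
        runAsNonRoot: true                    # refuse to start as UID 0
        runAsUser: 10001                      # assumed UID; match your image's user
        seccompProfile:
          type: RuntimeDefault                # required by the Restricted profile
      containers:
        - name: web
          image: registry.example.com/web:1.0.0   # placeholder image reference
          ports:
            - containerPort: 8080             # unprivileged port
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: tmp
              mountPath: /tmp                 # only writable path
      volumes:
        - name: tmp
          emptyDir: {}
```

If the application fails under this manifest, the error usually points at a specific write path or port, which is the thing to fix in the app.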

The Mindset Shift

Don't treat containers like lightweight VMs where you run anything.

Treat containers like production Linux systems where you follow security best practices.

You wouldn't run a production application as root on a Linux server. Why do it in a container?

The only difference is scale. A container compromise can be worse, because an escape from one container exposes every other workload on the node.

TL;DR

Run as non-root user. Add USER directive to Dockerfile.
Make filesystem read-only. Add readOnlyRootFilesystem: true to securityContext.
Drop unnecessary capabilities. Most apps need none: listen on a port ≥ 1024 and put a reverse proxy in front.
Enforce with Pod Security Standards. Don't leave it to developers.
Test everything. If app breaks, fix the app, not the security policy.
Container security isn't complex. It's negligent to skip it.
