Filesystem Illusion Inside Containers
When I first started working with containers, I had a fundamental misunderstanding. I thought Docker was somehow packaging entire operating systems into these lightweight bundles. How else could an Ubuntu container run on my CentOS server? It took me months to realize I was looking at this completely backwards.
Containers don't contain operating systems. They contain the illusion of operating systems. & once you understand how this illusion is crafted, you'll never look at containers the same way again.
Misconception That Started Everything
Let's start with what containers aren't. They're not mini virtual machines. They don't have their own kernels. They don't even have their own filesystems, not really. When you run `docker run ubuntu:latest`, you're not booting Ubuntu. You're starting a process that thinks it's running on Ubuntu.
This realization hit me when I discovered that every container on my system was sharing the exact same kernel:
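One way to see this for yourself, assuming Docker is installed (the image tags below are just illustrative):

```shell
# The host's kernel version
uname -r

# The same question asked inside containers built from different
# distributions -- every answer matches the host, because there
# is only one kernel
docker run --rm ubuntu:latest uname -r
docker run --rm alpine:latest uname -r
docker run --rm debian:latest uname -r
```

An "Ubuntu" container on a CentOS host reports the CentOS host's kernel version. The image only supplies userland files.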
So how does this work? How can the same kernel present completely different realities to different processes?
Kernel's Greatest Feature - Namespaces
The secret lies in Linux namespaces - a kernel feature that lets you create parallel universes for processes. Think of namespaces as filters that change what a process can see, without changing what actually exists.
Here's the key insight: when you create a new namespace, you're not creating new resources. You're creating a new view of existing resources.
Let me show you the most important namespace for containers - the mount namespace:
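Here's a quick sketch of a mount namespace in action, using two terminals (requires root and the `unshare` tool from util-linux):

```shell
# Terminal 1: start a shell inside its own mount namespace
sudo unshare --mount /bin/bash

# Still in terminal 1: overlay /tmp with a private tmpfs
mount -t tmpfs tmpfs /tmp
touch /tmp/only-in-this-universe
ls /tmp    # shows only-in-this-universe

# Terminal 2 (the original namespace): nothing changed here
ls /tmp    # only-in-this-universe does not exist in this view
```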
We've just created two different realities. Same filesystem, same files, but different views of where `/tmp` points.
Dissecting the Container Creation Process
Now let's build our own container runtime to understand exactly how this illusion is constructed. But instead of following the typical tutorial approach, let's reverse-engineer it by asking: "What would a process need to believe it's running in a completely different Linux distribution?"
Question 1: "What makes a Linux distribution unique?"
From a process perspective, it's mostly files in specific locations:
- `/etc/os-release` - tells you what distro this is
- `/bin`, `/usr/bin` - where programs live
- `/lib` - where libraries live
- `/etc` - where configuration lives
Question 2: "What kernel interfaces does every Linux process expect?"
- `/proc` - process & system information
- `/dev` - device files
- `/sys` - system information & control
Question 3: "What makes a process feel isolated?"
- Its own process tree (PID namespace)
- Its own hostname (UTS namespace)
- Its own network interfaces (network namespace)
- Its own view of mounted filesystems (mount namespace)
Let's build this step by step, but with a twist - we'll do it by creating the minimum viable illusion.
Building Minimum Viable Container
Instead of following a recipe, let's think like a magician. What's the smallest trick we can perform that makes a process believe it's in a different world?
Foundation with Different Root
Every Linux process believes the world starts at `/`. If we can change what `/` points to, we can change everything the process sees:
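Here's a minimal sketch of that trick using `chroot` (the paths are illustrative, and a statically linked `busybox` on the host is assumed so the process has something to run):

```shell
# Build the smallest possible "root filesystem"
mkdir -p /tmp/tiny-root/bin
echo "hello from a very small universe" > /tmp/tiny-root/hello.txt

# A statically linked shell so the process can actually execute
cp /bin/busybox /tmp/tiny-root/bin/
ln -sf busybox /tmp/tiny-root/bin/sh

# Change what "/" means for this process (requires root)
sudo chroot /tmp/tiny-root /bin/sh -c 'ls /'
# The process sees bin/ and hello.txt -- nothing else exists for it
```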
Congratulations! You've just created the world's simplest container. A process started here would think the entire universe consists of one text file.
Making It Believable by Adding Expected Pieces
Of course, a real container needs to be more convincing. Let's add the pieces that make a Linux environment feel real:
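A sketch of filling in those expectations - the distro identity file and the kernel interfaces from the questions above (the `$ROOTFS` path and the "Illusion Linux" identity are made up for illustration):

```shell
ROOTFS=/tmp/mini-rootfs
mkdir -p "$ROOTFS"/bin "$ROOTFS"/etc "$ROOTFS"/lib \
         "$ROOTFS"/proc "$ROOTFS"/sys "$ROOTFS"/dev

# Question 1: distro identity lives in /etc/os-release
cat > "$ROOTFS/etc/os-release" <<'EOF'
NAME="Illusion Linux"
ID=illusion
PRETTY_NAME="Illusion Linux 1.0"
EOF

# A shell to run inside (busybox assumed available on the host)
cp /bin/busybox "$ROOTFS/bin/"
ln -sf busybox "$ROOTFS/bin/sh"

# Question 2: the kernel interfaces every process expects (requires root)
sudo mount -t proc  proc "$ROOTFS/proc"
sudo mount -t sysfs sys  "$ROOTFS/sys"

# Step inside: the process now reads a distro that never existed
sudo chroot "$ROOTFS" /bin/sh -c 'cat /etc/os-release'
```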
Devil in the Details
What we've built works, but it's missing the subtle touches that make containers production-ready. Real container runtimes handle dozens of edge cases:
- Device Management - Containers need specific device files but not others
- Security Boundaries - Some parts of `/proc` & `/sys` are too dangerous to expose
- Resource Limits - The container should feel isolated but not escape resource controls
- Network Plumbing - Containers need their own network stack but ways to communicate
Here's how we add just one of these - proper device management:
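A sketch of the allow-list approach real runtimes take: create only the handful of device nodes programs depend on, using the standard Linux major/minor numbers, rather than exposing the host's `/dev` (the `$ROOTFS` path is illustrative):

```shell
ROOTFS=/tmp/mini-rootfs
mkdir -p "$ROOTFS/dev"

# Only the safe, essential character devices (requires root)
sudo mknod -m 666 "$ROOTFS/dev/null"    c 1 3
sudo mknod -m 666 "$ROOTFS/dev/zero"    c 1 5
sudo mknod -m 666 "$ROOTFS/dev/random"  c 1 8
sudo mknod -m 666 "$ROOTFS/dev/urandom" c 1 9
sudo mknod -m 666 "$ROOTFS/dev/tty"     c 5 0

# Deliberately absent: raw disks, /dev/mem, the host's ttys --
# the container gets what it needs, not what the host has
```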
Understanding Layers
Layers are a storage optimization, not a container requirement. Our container works perfectly without any layer system. We just copied files into a directory. Docker's layers are about sharing common files between containers & managing updates efficiently, not about creating isolation.
You could run production containers with our simple approach. You'd just use more disk space & have slower startup times.
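To make the "layers are an optimization" point concrete, here's roughly the trick Docker's overlay2 storage driver performs with OverlayFS (directory names here are made up; requires root):

```shell
mkdir -p /tmp/ovl/lower /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
echo "shipped in the image" > /tmp/ovl/lower/base.txt

# merged = read-only shared layer (lower) + per-container writes (upper)
sudo mount -t overlay overlay \
  -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work \
  /tmp/ovl/merged

echo "written by this container" > /tmp/ovl/merged/new.txt
ls /tmp/ovl/upper   # only new.txt: writes never touch the shared layer
```

Many containers can point their `lowerdir` at the same image layer, which is the entire disk-space win.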
Storage Illusion: Volumes vs Bind Mounts
Here's another place where Docker's marketing creates confusion. Docker presents "volumes" & "bind mounts" as fundamentally different concepts:
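The two forms as they appear on the command line (the volume name and host path are hypothetical):

```shell
# A named volume: Docker picks and manages the host-side directory
docker run --rm -v mydata:/data ubuntu:latest ls /data

# A bind mount: you pick the host-side directory yourself
docker run --rm -v /home/user/data:/data ubuntu:latest ls /data
```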
But under the hood? They're identical. Both are just bind mounts created before the `pivot_root` operation. The only difference is that Docker manages the host directory location for "volumes" (usually under `/var/lib/docker/volumes/`), while "bind mounts" let you specify the host path directly.
When you understand that containers are just filtered views of the host filesystem, this makes perfect sense. Whether the source directory is `/home/user/data` or `/var/lib/docker/volumes/abc123/_data`, the mechanism is identical: `mount --bind source destination`.
Of course, Docker's abstraction is useful for lifecycle management, portability & driver support.
It's All About Perspective
What we've built demonstrates the profound insight that containers are fundamentally about perspective. We haven't created new operating systems or even new filesystems. We've created new viewpoints on the same underlying system.
This perspective shift explains so many container behaviors that seem mysterious:
- Why containers start so fast (no OS to boot)
- Why they share resources so efficiently (same kernel)
- Why security is both easier & harder (shared kernel, isolated view)
- Why networking is complicated (network namespaces vs. shared networking)
- Why volumes & bind mounts are the same thing (both are just directories made visible; difference is in management)
From Docker Command to Running Container
Now that we understand the fundamental mechanisms, let's see how Docker orchestrates all these pieces when you run a simple command. The journey from `docker run` to a running container involves multiple layers, but at its core, it's still just the namespace & mount tricks we've been exploring.
Here's the complete flow:
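A simplified sketch of that chain (component responsibilities summarized; exact internals vary by Docker version):

```
docker run ubuntu:latest bash
        |
        v
  Docker CLI --(REST API)--> dockerd    parse flags, resolve the image
        |
        v
   containerd                           pull & unpack layers, manage
        |                               the container lifecycle
        v
containerd-shim                         keep the container alive even
        |                               if the daemon restarts
        v
      runc                              the actual container creation:
        |- clone()/unshare()    -> namespace filters
        |- mount()/pivot_root() -> new filesystem view
        '- execve("bash")       -> process starts in its new reality
```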
This entire complex orchestration is doing one fundamental thing: creating a sophisticated set of "perspective filters" for a single process. The process believes it owns an entire Linux system, but it's really just looking at the world through these carefully crafted illusions.
The actual container creation happens in that bottom section - the `runc` part. Everything else (Docker daemon, containerd) is just management, orchestration, & image handling. The real magic still comes down to the same Linux primitives we've been exploring:
- `clone()` & `unshare()` - create the namespace filters
- `mount()` & `pivot_root()` - switch the filesystem view
- `execve()` - start the process in its new reality
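Those primitives are all reachable from the shell, so we can approximate runc's core in one command (using the hypothetical `$ROOTFS` sketched earlier; `chroot` stands in for the stricter `pivot_root`; requires root):

```shell
ROOTFS=/tmp/mini-rootfs   # hypothetical rootfs built earlier

# unshare() -> new PID, mount & UTS namespaces; --fork makes the
# child PID 1 in its own universe. chroot -> the filesystem switch.
# Running the shell -> execve() into the new reality.
sudo unshare --pid --mount --uts --fork \
     chroot "$ROOTFS" /bin/sh -c 'echo "I am PID $$"'
# Inside its namespace the shell reports PID 1 -- it is "init"
```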
Docker's complexity isn't in creating containers - it's in managing them at scale.
Why This Matters
Container technology isn't magic - it's an elegant application of existing Linux kernel features. Understanding the underlying mechanisms helps you:
- Debug container issues by understanding what isolation is missing
- Make informed decisions about container security models
- Optimize container performance by understanding the overhead sources
- Design better container-based systems by working with the abstractions, not against them
The illusion is sophisticated, beautiful in its simplicity, & powerful in its application.