The container is the boundary. Here’s how we built a firewalled, dependency-cached, multi-repo Docker environment that provisions in under a minute.

In Part 1, we argued that running Claude Code autonomously on a bare host is a liability — the agent will discover and use credentials it was never told about. The responsible move isn’t avoiding the flag; it’s building a perimeter. Now let’s build it.


Design Constraints

Before writing a single line of config, we set five hard requirements:

  1. Isolated enough for safety — no host filesystem access beyond the workspace, no unrestricted network
  2. Connected enough for real work — must reach GitHub, npm, PyPI, the Anthropic API, and Firebase
  3. Fast to create — developers spin up several workspaces per day; anything over 60 seconds is too slow
  4. Reproducible — same base image, same pre-installed dependencies, every time
  5. Multi-repo aware — all repos checked out on the same feature branch, visible from one root

And one principle that shapes everything: non-root by default. The user inside the container has exactly one elevated privilege, and we’ll get to that.

Approach Overview

Before we dive into specifics, here’s the high-level map of how the pieces fit together.

The entire system is orchestrated by a single shell script: workspace.sh. When you run ./workspace.sh create feature-x, it:

  1. Creates git worktrees (not full clones) for each repo on the target branch — fast, lightweight, sharing the same .git object store
  2. Assigns a unique port range so multiple workspaces can run simultaneously without collisions — workspace 1 gets ports 3101/8081, workspace 2 gets 3102/8082, and so on
  3. Generates environment-specific configs: a .env file with the right ports and URLs, a devcontainer.json tailored to the workspace, and VSCode settings including color-coded titlebars
  4. Generates VSCode tasks that auto-start dev servers when the folder opens
  5. Creates a terminal launch configuration with a pre-built layout for monitoring
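
Step 1 is where most of the speed comes from. A minimal sketch, assuming two repos named frontend and backend under ~/repos (the names and paths are illustrative, not our actual layout):

```shell
# Sketch of step 1; repo names and paths are illustrative assumptions.
# A worktree shares the parent clone's .git object store, so creating
# one is nearly instant compared to a fresh clone.
BRANCH="feature-x"
WORKSPACE_DIR="$HOME/workspaces/$BRANCH"

for repo in frontend backend; do
  [ -d "$HOME/repos/$repo" ] || continue   # skip repos not checked out
  git -C "$HOME/repos/$repo" worktree add \
    "$WORKSPACE_DIR/$repo" -b "$BRANCH"    # -b creates the branch here
done
```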

The VSCode tasks deserve a callout: when Claude Code starts working autonomously, there’s no human to type npm run dev or flask run. The tasks auto-execute on folder open, so the frontend dev server, backend API, and database are already running from the first second. The workspace is ready before Claude writes its first line of code.
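
For illustration, one generated task might look like this; the label, command, and path are assumptions, while `"runOn": "folderOpen"` is the standard VSCode mechanism that triggers the auto-start:

```json
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "frontend dev server",
      "type": "shell",
      "command": "pnpm dev",
      "options": { "cwd": "${workspaceFolder}/frontend" },
      "isBackground": true,
      "runOptions": { "runOn": "folderOpen" }
    }
  ]
}
```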

Everything builds on top of a Docker dev container with pre-baked dependencies and an iptables firewall. The container is the security boundary; the generated configs make each workspace unique and self-sufficient.

Let’s walk through each layer, starting with the container.

The Dockerfile — Baking Dependencies

The base image starts from Node. On top of that, we install the system tools Claude Code needs to do real work (iptables, tmux, zsh, and friends), plus Claude Code itself, pre-installed globally via npm.

The critical optimization is dependency caching. Python and Node dependencies are pre-installed into a /home/deps/ layer during image build:

# Pseudo-code: Dependency layering

IMAGE LAYER 1: base (node:22)
IMAGE LAYER 2: system tools (iptables, tmux, zsh, etc.)
IMAGE LAYER 3: Claude Code (global npm install)
IMAGE LAYER 4: /home/deps/
    ├── python-venv/     # pip install from backend/requirements.txt
    └── node_modules/    # pnpm install from frontend/package.json

AT STARTUP (not build time):
    symlink /home/deps/python-venv  -> workspace/backend/.venv
    symlink /home/deps/node_modules -> workspace/frontend/node_modules
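
In Dockerfile terms, the layering might look roughly like this; the package names, file paths, and install commands are assumptions based on the diagram, not our exact build:

```dockerfile
FROM node:22

# Layer 2: system tools
RUN apt-get update \
    && apt-get install -y iptables ipset tmux zsh python3 python3-venv \
    && rm -rf /var/lib/apt/lists/*

# Layer 3: Claude Code, installed globally
RUN npm install -g @anthropic-ai/claude-code

# Layer 4: pre-baked dependencies, cached until the lockfiles change
COPY backend/requirements.txt /tmp/requirements.txt
RUN python3 -m venv /home/deps/python-venv \
    && /home/deps/python-venv/bin/pip install -r /tmp/requirements.txt

COPY frontend/package.json frontend/pnpm-lock.yaml /tmp/frontend/
RUN corepack enable && cd /tmp/frontend && pnpm install \
    && mv node_modules /home/deps/node_modules
```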

The symlink step takes milliseconds. No npm install at startup. No pip install. The workspace is ready the moment the container starts.
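
The linking step itself can be sketched in a few lines of shell; the overridable defaults and the mkdir are illustrative additions so the sketch runs anywhere:

```shell
#!/bin/sh
# Startup linking step: runs when the container starts, not at build.
# Paths mirror the layer diagram; the env-var defaults and the mkdir
# are illustrative so this sketch is runnable outside the container.
DEPS_DIR="${DEPS_DIR:-/home/deps}"
WORKSPACE="${WORKSPACE:-$HOME/workspace}"
mkdir -p "$WORKSPACE/backend" "$WORKSPACE/frontend"

# ln -sfn replaces a stale link in place instead of failing on rerun
ln -sfn "$DEPS_DIR/python-venv"  "$WORKSPACE/backend/.venv"
ln -sfn "$DEPS_DIR/node_modules" "$WORKSPACE/frontend/node_modules"
```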

When dependencies change, ./workspace.sh build rebuilds the shared image. Existing workspaces detect image drift and warn the developer. But day-to-day, the image is static and fast.

The container runs as a non-root node user. That user has exactly one sudo privilege: executing the firewall initialization script. Nothing else.
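
Wired up, that single privilege amounts to one sudoers line; the file path and script name here are illustrative:

```
# /etc/sudoers.d/node-firewall (illustrative path and script name):
# the node user may run exactly one command as root, and nothing else
node ALL=(root) NOPASSWD: /usr/local/bin/init-firewall.sh
```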

The Firewall — Default REJECT, Explicit Allow

This is the layer that makes --dangerously-skip-permissions responsible instead of reckless.

The firewall runs at container startup via the devcontainer postStartCommand. It uses iptables with ipset (type hash:net) for efficient IP range matching. The logic:

┌──────────────────────────────────────┐
│  iptables OUTPUT chain               │
│                                      │
│  Packet destination in ipset?        │
│  ├── YES ──> ACCEPT                  │
│  └── NO  ──> REJECT (fast failure)   │
│                                      │
│  Whitelisted:                        │
│  ✓ GitHub       (dynamic /meta API)  │
│  ✓ npm registry                      │
│  ✓ PyPI                              │
│  ✓ Anthropic API                     │
│  ✓ Google OAuth (broad CIDRs)        │
│  ✓ Docker DNS   (127.0.0.11)         │
│  ✓ Host network (local DB/Redis)     │
│  ✓ Etc                               │
│                                      │
│  Everything else:                    │
│  ✗ REJECTED                          │
└──────────────────────────────────────┘
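
A condensed sketch of how those rules are built; the ipset name and CIDRs are illustrative assumptions, not the real whitelist, and the real script runs the commands as root via the single allowed sudo entry point:

```shell
# Emits the rule set instead of applying it, so the sketch can be
# reviewed without root. Set name and CIDRs are illustrative.
emit_firewall_rules() {
  cat <<'EOF'
ipset create allowed hash:net -exist
ipset add allowed 160.79.104.0/23 -exist   # e.g. an API range (illustrative)
ipset add allowed 127.0.0.11/32 -exist     # Docker embedded DNS
iptables -A OUTPUT -o lo -j ACCEPT
iptables -A OUTPUT -m set --match-set allowed dst -j ACCEPT
iptables -A OUTPUT -j REJECT
EOF
}

emit_firewall_rules
```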

A few details worth calling out:

REJECT, not DROP. This is important. A DROP policy silently swallows packets, so Claude Code would hang waiting for timeouts. REJECT sends an immediate “connection refused,” giving Claude Code a clear signal that the endpoint isn’t reachable. Fast failure, fast adaptation.

Verification is built in. The firewall script ends by testing that a known non-whitelisted domain is blocked and a whitelisted one is reachable. If either check fails, the script exits non-zero and the developer sees it immediately.
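
The shape of that check, sketched with a hypothetical `check` helper; the real script’s test domains are redacted here, and any non-whitelisted/whitelisted pair works:

```shell
# check <expectation> <url>: returns 0 when reality matches expectation.
# The helper name and the usage domains below are illustrative.
check() {
  if curl --silent --max-time 5 --output /dev/null "$2"; then
    [ "$1" = reachable ]
  else
    [ "$1" = blocked ]
  fi
}

# Usage at the end of the firewall script (illustrative):
#   check blocked   https://some-blocked-domain.example      || exit 1
#   check reachable https://some-whitelisted-domain.example  || exit 1
```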

The Drawbacks, Honest Assessment

This setup works well for us, but it’s not free:

Firewall maintenance is ongoing. Every new external service requires a whitelist update.

Resource consumption adds up. Each workspace is a full Docker container. At 4-5 parallel workspaces on a 32GB machine, RAM gets tight. We haven’t hit a wall yet, but it’s the next scaling constraint.

Setup complexity is real. The system spans a Dockerfile, a firewall script, a dependency-linking script, a devcontainer template, environment variable generation, and port allocation. Debugging a failure means understanding all six layers. New team members need a walkthrough.

We accept these tradeoffs because the alternative — running --dangerously-skip-permissions on an uncontained host — is worse by every measure.

workspace.sh — One Command to Rule Them All

The shell script is the glue. It’s split into focused modules:

workspace.sh
├── config.sh        # Shared constants, paths, color palette
├── utils.sh         # Helper functions
├── build.sh         # Docker image rebuild
├── create.sh        # Full workspace provisioning
├── env.sh           # Environment variable generation
├── devcontainer.sh  # devcontainer.json templating
├── remove.sh        # Teardown and cleanup
└── list.sh          # Show all workspaces

The lifecycle of ./workspace.sh create feature-x:

1. Create git worktrees for all repos on branch "feature-x"
2. Assign port range (base + offset per workspace)
      Workspace 1: frontend=3101, backend=8081
      Workspace 2: frontend=3102, backend=8082
      Workspace 3: frontend=3103, backend=8083
3. Generate .env from template, substituting ports and URLs
4. Generate devcontainer.json with workspace-specific config
5. Generate VSCode tasks.json (auto-start dev servers)
6. Create Warp terminal launch configuration
7. Done. Open in Cursor/VSCode to start container.
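
The port math in step 2 is just a fixed base plus the workspace index; the base values below are inferred from the examples above, and the function name is illustrative:

```shell
# Port assignment sketch: bases inferred from the examples
# (workspace 1 gets 3101/8081), derived as base + workspace index.
assign_ports() {
  index="$1"                       # 1 for the first workspace, 2 for the next
  FRONTEND_PORT=$((3100 + index))
  BACKEND_PORT=$((8080 + index))
  echo "frontend=$FRONTEND_PORT backend=$BACKEND_PORT"
}
```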

./workspace.sh remove feature-x reverses it: tears down git worktrees, removes the container, cleans up the Warp config.

./workspace.sh list shows all active workspaces with their branches and assigned ports.

./workspace.sh build rebuilds the shared Docker image when dependencies change.

The key insight: developers never interact with Docker, git worktrees, or networking directly. They run one command and get an isolated, firewalled, dependency-ready environment. The complexity is real, but it’s encapsulated.


In Part 3, we’ll show what the daily workflow actually looks like once the infrastructure is in place — managing parallel agents across color-coded workspaces, using CLAUDE.md as a control plane for steering autonomous sessions, and the lessons we learned the hard way.
