Air-Gapped AI Agents on Windows: Enterprise Zero-Trust Setup

Medium, by Raja Das, April 6, 2026

TL;DR

A technical executive built a fully isolated, zero-trust AI agent environment on Windows using OpenClaw + Ollama + Podman, demonstrating that enterprise-grade local AI deployment is technically feasible but requires navigating underdocumented integration layers.

The Enterprise Case for Air-Gapped AI

Most enterprise AI deployments remain cloud-dependent, sending sensitive data to third-party APIs for processing. For regulated industries—clinical trial data, proprietary formulations, M&A intelligence, supply chain decisions—this dependency is increasingly untenable. Regulatory frameworks (EU AI Act, FDA guidance for medical device software, SEC interpretations of AI-generated disclosures) are already requiring organizations to demonstrate where AI reasoning happens, on what data, and with what controls.

Local AI agents—models running on enterprise infrastructure in containers with explicit network boundaries—are no longer hobbyist experiments. They represent the architectural pattern that will underpin compliant AI in regulated sectors. This build proves the pattern works on consumer hardware.

The Technical Stack

The build used four core components: OpenClaw (an open-source AI agent framework), Ollama (a local LLM server that needs no cloud API), Podman (a daemonless container runtime that runs rootless by default, which the author considers more secure than Docker for enterprise use), and WSL2 with Ubuntu (Linux on Windows). The goal was complete isolation: no data egress, no external API calls, the model running locally, and the agent sandboxed in a container.

Phase 1: Windows Subsystem for Linux Configuration

The first obstacle was environmental. Installing Ubuntu 24.04 via WSL threw an "Invalid distribution name" error, caused by an outdated WSL installation. The fix was a WSL update followed by a simplified distribution reference. With that resolved, the next problem was that the default user account lacked sudo privileges, because the Ubuntu install had completed without prompting for user creation. Manual intervention was required: dropping into the root shell, creating the user account, and granting it sudo permissions.
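The recovery sequence described above can be sketched as a small Python helper that only prints the commands instead of running them. The article does not say which "simplified distribution reference" was used or what the account was called, so the distribution name "Ubuntu" and the username "agent" below are assumptions for illustration.

```python
# Sketch only: builds the recovery commands as argv lists and prints them.
# "Ubuntu" and "agent" are illustrative assumptions, not values from the article.

def wsl_recovery_commands(distro="Ubuntu", user="agent"):
    """Return the WSL recovery steps, in order, as argv lists."""
    # Prefix that runs a single command inside the distribution as root.
    as_root = ["wsl", "-d", distro, "-u", "root", "--"]
    return [
        ["wsl", "--update"],                          # refresh an outdated WSL install
        ["wsl", "--install", "-d", distro],           # install using the short distribution name
        as_root + ["adduser", user],                  # create the missing user account
        as_root + ["usermod", "-aG", "sudo", user],   # grant sudo through the sudo group
    ]


if __name__ == "__main__":
    for argv in wsl_recovery_commands():
        print(" ".join(argv))
```

Printing rather than executing keeps the sketch safe to run anywhere; the commands themselves are meant for a Windows terminal.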

For executives: this represents "shadow IT done correctly." The friction is real, but it's the friction of doing things securely—containerized, permissioned, auditable. The alternative is employees pasting sensitive data into consumer ChatGPT with no controls.

Phase 2: Rootless Podman and Container Runtime Configuration

Podman is the enterprise choice because it runs without a root daemon—no elevated background process, no persistent system service attack surface. Containers run as the invoking user. However, rootless Podman requires careful environment configuration.

The first blocker: XDG_RUNTIME_DIR ownership errors. This Linux environment variable tells processes where to find the current user's runtime directory. Without it set correctly, Podman cannot operate. The permanent fix required exporting the variable and creating the directory with correct permissions in ~/.bashrc.
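A quick diagnostic for this class of error might look like the following. It is a minimal sketch, not the author's fix: it only checks the two properties the XDG Base Directory specification asks of the runtime directory (owned by the user, mode 0700), and the function name is hypothetical.

```python
import os
import stat


def runtime_dir_problems(path, uid):
    """Return a list of human-readable problems with a runtime directory."""
    if not path:
        return ["XDG_RUNTIME_DIR is not set"]
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory"]

    info = os.stat(path)
    mode = stat.S_IMODE(info.st_mode)
    problems = []
    if info.st_uid != uid:
        problems.append(f"{path} is owned by uid {info.st_uid}, expected {uid}")
    if mode != 0o700:
        problems.append(f"{path} has mode {mode:o}, expected 700")
    return problems


if __name__ == "__main__":
    issues = runtime_dir_problems(os.environ.get("XDG_RUNTIME_DIR"), os.getuid())
    print("\n".join(issues) if issues else "runtime directory looks usable")
```

Run inside the WSL distribution as the user who will invoke Podman; an empty result means the ownership and permission requirements are met.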

The build then failed on an OpenClaw dependency: the canvas library, installed through pnpm, broke during container image construction. A clean rebuild with the --no-cache flag resolved it. Sandbox mode (the agent's tool execution environment, distinct from the container boundary itself) was temporarily disabled to isolate error sources; container-level isolation remained intact throughout.
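One way to automate the clean-rebuild step is to try a normal build first and fall back to --no-cache only when it fails. This is a sketch under assumptions: the image tag "openclaw:local" and the build context "." are hypothetical, and the article does not show the exact build command used.

```python
import subprocess


def podman_build_argv(tag, context=".", no_cache=False):
    """Build the argv for `podman build`, optionally bypassing the layer cache."""
    argv = ["podman", "build"]
    if no_cache:
        argv.append("--no-cache")
    return argv + ["-t", tag, context]


def build_with_clean_retry(tag, context=".", run=subprocess.call):
    """Try a cached build; on failure retry once with --no-cache.

    `run` takes an argv list and returns an exit code, which makes the
    function easy to exercise without Podman installed.
    """
    if run(podman_build_argv(tag, context, no_cache=False)) == 0:
        return "cached"
    if run(podman_build_argv(tag, context, no_cache=True)) == 0:
        return "clean"
    return "failed"


if __name__ == "__main__":
    # Hypothetical tag; print the command instead of running it.
    print(" ".join(podman_build_argv("openclaw:local", no_cache=True)))
```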

Phase 3: Ollama Integration and Networking

Ollama is elegant software: pull a model, run a server, point your application at it. The complexity came from containerization: inside a container, localhost refers to the container itself, so a service listening on the host's localhost cannot be reached at that address. The solution is host.docker.internal, a special hostname that resolves from inside the container to the host machine. This is standard behavior in Docker Desktop and, per the author, is supported in Podman on Windows; Podman also provides its own equivalent, host.containers.internal.
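To confirm that the container can actually reach the host's Ollama server, a small probe run from inside the container can help. The sketch below assumes Ollama's default port 11434 and uses its /api/tags endpoint, which lists locally available models; the function names are illustrative and not part of OpenClaw or Ollama.

```python
import json
import urllib.error
import urllib.request


def ollama_tags_url(host="host.docker.internal", port=11434):
    """URL of the Ollama endpoint that lists locally pulled models."""
    return f"http://{host}:{port}/api/tags"


def parse_model_names(payload):
    """Extract model names from an /api/tags JSON response body."""
    return [m["name"] for m in json.loads(payload).get("models", [])]


def list_models(url, timeout=5.0):
    """Return the model names, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_model_names(resp.read())
    except (urllib.error.URLError, OSError, ValueError):
        return None
```

Calling list_models(ollama_tags_url()) inside the container should return a list such as the pulled llama3.2 variants, and None when the host is unreachable. If the probe fails even though Ollama is running, the bind address is worth checking: Ollama listens on 127.0.0.1 by default, and the OLLAMA_HOST environment variable changes that.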

Before networking could work, missing dependencies had to be installed (zstd had to be present before running the Ollama installer). Then OpenClaw's device authentication system, designed to prevent unauthorized access, kept rejecting the connection. This required disabling device auth for local-only deployment and resetting device state. The model provider could not be configured through the UI, because validation rejected the Ollama endpoint format, so it had to be set through CLI commands.

The final integration blocker: OpenClaw was writing HEARTBEAT.md to the agent context on every cycle, flooding working memory with noise. Disabling heartbeat pulses improved performance and usability.

Model Selection: The Performance-Capability Tradeoff

Initial testing with llama3.2:3b (a 3-billion-parameter model) produced response times of 20 to 30 seconds on CPU, long enough that the UI timeout triggered before responses completed. Switching to llama3.2:1b (the 1-billion-parameter variant) brought latency into an acceptable interactive range.
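Latency comparisons like this one can be made repeatable by timing a non-streaming request against Ollama's /api/generate endpoint, whose response reports durations in nanoseconds. The following is a minimal sketch; the helper names are hypothetical, and the base URL passed to generate depends on where the script runs.

```python
import json
import urllib.request


def tokens_per_second(result):
    """Generation speed from an /api/generate response (durations are in ns)."""
    return result["eval_count"] / result["eval_duration"] * 1e9


def total_seconds(result):
    """End-to-end request time in seconds, as reported by the server."""
    return result["total_duration"] / 1e9


def generate(base_url, model, prompt, timeout=120.0):
    """Send one non-streaming generation request and return the parsed JSON."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())
```

Running generate once per candidate model with the same prompt, then comparing total_seconds against the UI timeout, turns the "too slow" observation into a number that can be tracked across hardware.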

The tradeoff is explicit: smaller models execute faster but are less capable. For structured data extraction, template generation, classification, and summarization of well-defined inputs, 1b parameter models perform adequately. Complex reasoning, code generation, and nuanced analysis require larger models on GPU infrastructure. The architecture scales; the laptop proof-of-concept is not the production target.
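The tradeoff can be written down as an explicit policy check. The sketch below picks the largest model that stays within a latency budget, using parameter count as a crude stand-in for capability (an assumption made here, not a claim from the article). The 25-second figure is the midpoint of the reported 20 to 30 second range; the 4-second figure for the 1b model is a placeholder, since the article gives no measurement for it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    params_billions: float
    latency_s: float  # measured on the target hardware


@dataclass(frozen=True)
class Policy:
    max_latency_s: float
    min_params_billions: float = 0.0


def choose_model(candidates: Sequence[Candidate], policy: Policy) -> Optional[Candidate]:
    """Return the largest candidate that satisfies the policy, or None."""
    acceptable = [
        c for c in candidates
        if c.latency_s <= policy.max_latency_s
        and c.params_billions >= policy.min_params_billions
    ]
    return max(acceptable, key=lambda c: c.params_billions, default=None)


if __name__ == "__main__":
    candidates = [
        Candidate("llama3.2:3b", 3.0, 25.0),  # midpoint of the reported range
        Candidate("llama3.2:1b", 1.0, 4.0),   # placeholder latency, not measured
    ]
    print(choose_model(candidates, Policy(max_latency_s=10.0)))
```

A None result is the useful case: it signals that no model meets both the latency budget and the capability floor on this hardware, which is the point at which the workload needs GPU infrastructure.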

Why This Matters for the AI Ecosystem

Three critical insights emerge from this build.

(1) The tooling exists, but integration documentation does not. OpenClaw, Ollama, Podman, and WSL2 are all mature, open-source, and enterprise-viable. The integration layer (getting them to communicate across the WSL/container/host boundary) is underdocumented. Teams will hit these same walls. Budget for debugging time.

(2) The security model is sound. Rootless Podman + local Ollama + zero external APIs constitutes a legitimate zero-trust architecture. Data never leaves the machine: it moves only between the agent container and the Ollama server on the same host. The attack surface is small and auditable.

(3) Model selection is a strategic policy question, not a technical one. Which model to run, on what hardware, and at what parameter count are decisions to be made jointly by infrastructure and AI strategy teams. Executives must own the policy layer: acceptable latency, minimum capability threshold, and which data can run on-device versus requiring cloud infrastructure.

For regulated industries (clinical document processing, supply chain anomaly detection on sensitive data, proprietary formulation assistance, drug safety signal analysis), this pattern is not just preferable—it may become mandatory. Compliance teams and IT security are wrestling with this now.

Key Takeaways

Enterprise-grade air-gapped AI deployment is technically viable today using open-source components (OpenClaw, Ollama, Podman, WSL2), but requires navigating significant underdocumented integration points.
Rootless Podman provides the security-first container architecture that regulated industries require—no root daemon, no persistent attack surface, auditable data boundaries.
Local LLM inference on consumer hardware is feasible with smaller models (1b parameters) for structured tasks; larger models and complex reasoning require GPU infrastructure but use the same reproducible architecture.
The regulatory landscape (EU AI Act, FDA medical device guidance, SEC AI disclosures) is already demanding demonstration of where AI reasoning occurs and what controls govern it—this build pattern directly addresses that requirement.
Model selection is a business and compliance decision, not a technical one; teams must establish policy on acceptable latency, minimum capability, and which workloads can run locally versus requiring cloud infrastructure.

Source: Raja Das, Medium, April 2026
