Building AI Ops

A journey to safe, governed, autonomous software development — from concept to production-ready platform.

When the Switch Product Development organization onboarded Claude Code and its powerful foundation models into our software development lifecycle, it fundamentally changed how we build software. AI agents could now write code, debug issues, and complete complex tasks autonomously. But this leap forward surfaced a critical question for any enterprise organization: How do we harness this transformative technology while maintaining safety, consistency, and control?

The team recognized this gap and, within a month, built a complete end-to-end AI Ops platform from scratch — a system that provides the governance, infrastructure, and specifications needed to put AI assistants and agentic coding tools to work responsibly across the organization.

The question was never whether to adopt AI code assistants — it was how to use them safely, consistently, and productively at enterprise scale.


The Four Pillars

Pillar 1
SGA — Governance Through Policy Compilation

SGA (Switch Governed Agents) eases the governance challenge by transforming organizational policy into runtime guidance that AI assistants automatically follow. Instead of relying on developers to remember our rules and standards and re-teach them to AI agents in every session, we compile a central policy registry into governance artifacts that load automatically.

Policy maintainers define rules, behaviors, skills, and agent templates in a central registry stored in Git. Developers run sga sync to pull the latest policy, then sga apply to compile it into their workspace. This generates:

  • Compiled Agents — Role-specific prompts like @backend-implementer or @security-reviewer that automatically embed relevant rules
  • Auto-triggered Skills — Domain-specific guidance that activates when AI assistants encounter specific file types or code patterns
  • Slash Commands — On-demand checks like /sga-security for explicit verification
  • Policy Context — Baseline governance awareness that all agents inherit

This solves three critical problems: governance without friction (developers don't need to remember rules), consistency across teams (everyone uses the same compiled artifacts), and adaptability (policy updates propagate automatically).
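As a rough sketch of what the sga apply compilation step might produce, consider this toy Python model. The names here (Rule, compile_agent, the rule IDs) are illustrative stand-ins, not the real SGA internals; only the idea of rendering registry rules into a role-specific agent prompt comes from the platform.

```python
# Toy model of policy compilation: filter registry rules by role and
# embed them into a compiled agent prompt. Illustrative only.
from dataclasses import dataclass

@dataclass
class Rule:
    id: str
    text: str
    roles: list[str]  # which compiled agents inherit this rule

REGISTRY = [
    Rule("SEC-001", "Never log credentials or API keys.",
         ["backend-implementer", "security-reviewer"]),
    Rule("API-002", "All public endpoints require input validation.",
         ["backend-implementer"]),
]

def compile_agent(role: str, registry: list[Rule]) -> str:
    """Render a role-specific prompt with its governing rules embedded."""
    rules = [r for r in registry if role in r.roles]
    lines = [f"You are @{role}. Follow these organizational rules:"]
    lines += [f"- [{r.id}] {r.text}" for r in rules]
    return "\n".join(lines)

prompt = compile_agent("backend-implementer", REGISTRY)
print(prompt)
```

The key property is that the developer never pastes rules by hand: the compiled artifact carries them into every session automatically.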

Pillar 2
CNAM — Persistent Memory for Context Continuity

CNAM (Cloud-Native Agentic Memory) addresses a fundamental limitation of current AI assistants: they lack memory between sessions. Every interaction starts fresh, requiring users to repeatedly provide context. CNAM provides a multi-tenant memory system that AI assistants can query and update through MCP as they work. Six tools cover the memory lifecycle:

  • remember — Store facts with optional tags, classification, and scope
  • recall — Search memories using semantic similarity, keyword matching, or hybrid approaches
  • forget — Remove specific memories by identifier
  • list_memories — Browse memories with filtering capabilities
  • share_memory — Change scope between personal and organizational
  • export_memories — Export for backup or analysis

Memories exist at two scopes: personal memories visible only to the individual user, and organizational memories shared across teams — enabling both private knowledge accumulation and collective institutional learning.

The system classifies memories by type: factual (infrastructure standards, configurations, architectural decisions), experiential (debugging insights, lessons learned), and working (temporary session context that auto-expires).

CNAM is built on PostgreSQL, Qdrant for vector search, and Ollama for embeddings, and ships as a production-grade cloud-native service with an admin console, API key provisioning, and audit logging.
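A minimal sketch of the memory model helps make scopes, classification, and expiry concrete. The real service uses PostgreSQL, Qdrant, and Ollama embeddings for semantic recall; this toy version approximates recall with keyword overlap and is not the CNAM API.

```python
# Toy sketch of CNAM's memory model: scoped, classified memories with
# keyword-based recall and auto-expiring "working" entries.
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    scope: str = "personal"     # "personal" or "org"
    kind: str = "factual"       # "factual" | "experiential" | "working"
    tags: list = field(default_factory=list)
    created: float = field(default_factory=time.time)

class MemoryStore:
    WORKING_TTL = 3600.0        # working memories expire after an hour

    def __init__(self):
        self._items: list[Memory] = []

    def remember(self, text, **kw):
        self._items.append(Memory(text, **kw))

    def recall(self, query: str, scope=None):
        now = time.time()
        words = set(query.lower().split())
        hits = []
        for m in self._items:
            if m.kind == "working" and now - m.created > self.WORKING_TTL:
                continue        # expired working context is invisible
            if scope and m.scope != scope:
                continue
            overlap = len(words & set(m.text.lower().split()))
            if overlap:
                hits.append((overlap, m))
        return [m for _, m in sorted(hits, key=lambda p: -p[0])]

store = MemoryStore()
store.remember("All services deploy via Helm charts", scope="org")
store.remember("Flaky test traced to clock skew", kind="experiential")
print([m.text for m in store.recall("deploy Helm")])
```

Swapping the keyword overlap for vector similarity over embeddings yields the semantic and hybrid search modes the real recall tool offers.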

Pillar 3
Runcell — Secure Isolation for Autonomous Execution

Runcell enables the most powerful — and potentially risky — capability of AI assistants: autonomous execution. It creates isolated workspace containers where task execution and experimentation happen safely.

The architecture introduces a critical distinction: Outer Agent versus Inner Agent. The outer agent runs in your terminal, orchestrating work and creating workspaces. The inner agent runs inside those workspaces with its own API key, working autonomously on delegated tasks. This separation provides security isolation and enables parallel autonomous workflows.

Two deployment modes are supported. Local mode runs on individual machines using Podman for fast, network-independent workspace creation. Remote mode connects to a centralized Runcell server that orchestrates workspaces as Kubernetes pods, providing centralized management, shared infrastructure, and full audit logging.

Security is baked in through multiple layers: containers run as non-root users with dropped capabilities, seccomp profiles restrict dangerous syscalls, cgroups enforce resource limits, and networks can be fully isolated. Credentials are managed securely through the macOS Keychain and injected into containers at runtime.
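In local mode, these layers translate into container hardening flags. The sketch below builds a Podman invocation using standard options for the protections described above; the exact profile Runcell applies is internal, so treat this as an approximation (image name and seccomp profile path are placeholders, and a real seccomp profile would be passed via --security-opt seccomp=profile.json).

```python
# Approximate the hardening Runcell applies to a local workspace
# container, using standard Podman flags. Illustrative, not the
# actual Runcell configuration.
def workspace_command(image: str, name: str) -> list[str]:
    return [
        "podman", "run", "--rm", "--name", name,
        "--user", "1000:1000",              # run as a non-root user
        "--cap-drop", "ALL",                # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--memory", "2g", "--cpus", "2",    # cgroup resource limits
        "--network", "none",                # fully isolated network
        image,
    ]

cmd = workspace_command("registry.example/agent-workspace:latest", "ws-42")
print(" ".join(cmd))
```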

Pillar 4
Simplex — High-Fidelity Specifications for Autonomous Agents

Simplex is a workflow specification designed specifically for autonomous agents. It occupies a crucial middle ground between natural language (too ambiguous) and programming languages (too prescriptive). Simplex captures what needs to be done and how to know when it's done, without prescribing how to do it.

The specification uses landmarks — structural markers like FUNCTION, RULES, DONE_WHEN, EXAMPLES, and ERRORS — that agents recognize and orient around. Five design principles guide the specification:

  • Enforced simplicity — Complexity that cannot be decomposed is complexity not yet understood
  • Syntactic tolerance, semantic precision — Agents interpret what you meant, but the meaning itself must be unambiguous
  • Testability — Every function requires examples that serve as contracts defining correct output
  • Completeness — Specifications must be sufficient to generate working code without further clarification
  • Implementation autonomy — Simplex describes behavior and constraints, never implementation details
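To make the landmarks concrete, here is a hypothetical Simplex specification. The function, rules, and examples are invented for illustration, and the real syntax may differ; only the landmark names come from the specification itself.

```
FUNCTION normalize_email
  Accepts a raw email address string and returns a canonical form.

RULES
  - Lowercase the entire address.
  - Trim surrounding whitespace.
  - Reject addresses without exactly one "@".

DONE_WHEN
  Every example below produces the stated output, and invalid
  input raises the error described under ERRORS.

EXAMPLES
  "  Alice@Example.COM " -> "alice@example.com"
  "bob@switch.dev"       -> "bob@switch.dev"

ERRORS
  "no-at-sign" -> InvalidEmail
```

Note what the spec never says: no language, no parsing strategy, no data structures. The agent retains implementation autonomy while the examples serve as a contract.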

How It Works Together

The true power of the platform emerges when these components work in concert. Here's the end-to-end workflow:

  1. Policy Definition & Compilation (SGA) — Policy maintainers define organizational rules in a central registry. Developers run sga apply to compile policy into their workspace, generating role-specific agents, auto-triggering skills, and baseline governance context.
  2. Context Initialization (CNAM) — When an AI assistant starts a session, it recalls relevant memories such as project patterns, architectural decisions, and debugging insights that persist from previous sessions, providing immediate context without repetitive explanation.
  3. Workspace Creation (Runcell) — When the AI assistant needs to execute code or interact with systems, it creates an isolated workspace container. Required credentials are injected securely, and the container provides a safe, ephemeral environment.
  4. Autonomous Execution (Simplex + Runcell) — For complex tasks, the AI assistant invokes specialized agents compiled via SGA and delegates work to them. These agents may spawn inner agents in separate workspaces, each running autonomously. Simplex specifications provide the high-fidelity instructions they need.
  5. Learning & Evolution (CNAM) — As work progresses, the AI assistant remembers new patterns, solutions, and decisions. Future sessions benefit from this accumulated knowledge without manual synchronization.

The Model Context Protocol (MCP) serves as the integration layer, standardizing how AI assistants communicate with these systems. CNAM provides MCP tools for memory operations, Runcell provides MCP tools for workspace and file operations, and SGA provides governance context that all agents automatically inherit.
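The whole loop can be sketched in a few lines. Every function below is a trivial stand-in for a real MCP tool or CLI call (none of these names are actual platform APIs); only the shape of the flow, recall then governed delegation then persistence, reflects the workflow above.

```python
# Hypothetical end-to-end flow of one delegated task across the four
# pillars. All functions are illustrative stubs, not platform APIs.
memories: list[str] = []

def recall(query):                   # CNAM stand-in: prior knowledge
    return [m for m in memories if query in m]

def remember(fact):                  # CNAM stand-in: persist learning
    memories.append(fact)

def compiled_agent(role):            # SGA stand-in: governed prompt
    return f"@{role} (org rules embedded)"

def delegate(agent, spec, context):  # Runcell stand-in: inner agent
    return {"agent": agent, "spec": spec,
            "context": context, "status": "done"}

def run_task(spec):
    context = recall(spec)                          # 1. recover context
    agent = compiled_agent("backend-implementer")   # 2. governed agent
    result = delegate(agent, spec, context)         # 3. isolated execution
    remember(spec)                                  # 4. persist learning
    return result

print(run_task("add rate limiting")["status"])
```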


Technical Innovation

Compile-Time Governance
Traditional governance enforces rules at runtime through reviews and linters. SGA compiles governance into the AI assistant's context before any work begins — shifting from reactive policing to proactive guidance.
Semantic Multi-Tenant Memory
CNAM combines vector-based semantic search with role-based isolation, enabling personal knowledge accumulation and team knowledge sharing. Memory classification by type allows appropriate retention policies.
Inner / Outer Agent Architecture
The distinction between orchestrator and worker agents, each with isolated workspaces and credentials, enables both security and parallelism. Multiple autonomous agents work concurrently without interference.
LLM-Native Specification Language
Rather than forcing LLMs to parse formal grammars or ambiguous natural language, Simplex provides structural landmarks that agents recognize, syntactic tolerance for variation, and semantic precision for unambiguous meaning.
Cloud-Native Throughout
All components are designed for Kubernetes deployment — CNAM on CloudNativePG with Istio service mesh, Runcell orchestrating workspace pods, SGA artifacts version-controlled as governance-as-code in Git.

Business Impact

Accelerating Delivery While Maintaining Safety
By providing governance, memory, and isolation as infrastructure, we accelerate AI assistant adoption without compromising on safety or quality. Developers work more productively, confident that organizational standards are embedded automatically and execution happens in isolated, auditable environments.
Scaling AI Capabilities Across Teams
Policy updates propagate automatically. Shared memory enables organizational knowledge accumulation. Centralized Runcell infrastructure supports team collaboration and audit logging. We're building organizational capability, not just individual productivity.
Building Competitive Advantage Through Infrastructure
Most organizations approach AI code assistants as tools to be adopted. We've approached them as infrastructure to be built. Tools provide capabilities; infrastructure provides leverage. By building governance, memory, and isolation as platform services, we create leverage that compounds as adoption grows.
Positioning for the Future
We're seeing the beginning of autonomous agents that can perform complex software tasks independently. Our platform anticipates this — Simplex for specifications, Runcell for secure execution, CNAM for persistent memory, SGA for governed behavior. We're building the foundation for tomorrow's autonomous development workflows.

Looking Forward

What this platform enables next is transformative. Organizations can define Simplex specifications for complex software systems and have autonomous agents build them end-to-end, with governance automatically enforced, context accumulated across iterations, and execution safely isolated.

Teams can share knowledge through CNAM's org-scoped memories, avoiding the rediscovery of solutions to known problems. Policy can evolve centrally and propagate automatically, keeping governance in sync with organizational learning.

The broader vision for AI Ops is clear: create the infrastructure that enables safe, governed, productive use of AI across the entire software development lifecycle. Now we expand — integrating with CI/CD pipelines, extending to testing and deployment, enabling multi-agent swarm architectures, and building observability into AI-assisted workflows.

The question is no longer "Can we use AI assistants?" The question is "How can we use AI assistants to transform software development?" With our AI Ops platform, we have the answer.

Special thanks to Logan Broitman, Ahmed Schrute, Steven Shurtliff, Nate Lantz, Leah Hunnewell, Jared Bischoff, and Travis Harrison for their contributions to making this platform a reality.