Agent Ops: OSes, Plugins, Code Review, and Production Minions
Sycamore raises $65M to let enterprises build, deploy, and monitor AI agents. Sycamore is building an enterprise agent operating system and just closed a $65M raise to scale deployment and monitoring for agent fleets. Outcome engineers should treat agent OSes as foundational infra—they centralize orchestration, observability, and policy enforcement for production agents (Principle 09).
Copilot Cowork now available in Frontier — new Researcher Critique uses Anthropic and OpenAI models. Microsoft ships Copilot Cowork with a Researcher Critique feature that composes Anthropic and OpenAI models inside enterprise sandboxes. This formalizes multi-model evaluation and safe staging, giving teams built-in critique workflows and cross-model validation useful for auditing and improving agent outputs (Principles 07,16).
Qodo raises $70M Series B to scale AI agents for code review, testing, and governance. Qodo is doubling down on automated code verification and governance with fresh funding to scale agents that review, test, and enforce policy across AI-generated code. Expect these patterns to push verification, test automation, and audit trails into the CI/CD fabric that outcome engineers operate (Principles 14,16).
OpenAI introduces a Codex plugin for Claude Code, letting users invoke Codex from inside Claude Code to review code or delegate tasks. OpenAI’s Codex plugin lets Claude Code call Codex directly, creating an explicit multi-model plugin pattern for delegated coding and review. Teams must now design clear capability contracts and artifact handoffs between models and plugins to avoid silent failure modes in agent chains (Principles 09,11).
How Stripe built “minions”: AI coding agents that ship 1,300 PRs per week. Stripe’s “minions” convert Slack reactions into cloud-backed agents that produce roughly 1,300 reviewable PRs weekly. This is a concrete playbook for agentic delivery at scale—outcome engineers need robust PR gating, human-in-the-loop checkpoints, and governance hooks to turn agent output into safe, auditable production code (Principles 03,09,14).