
Outcome Ops: Inference tiers, RAG alternatives, replayable agents, security

Google adds Flex and Priority inference tiers to Gemini API for enterprise cost and reliability control. The new tiers let teams trade cost against latency and availability. Outcome engineers can use them to budget agent SLAs and avoid noisy-neighbor failures in orchestration systems (Principle 12).

We replaced RAG with a virtual filesystem for our AI documentation assistant. Mintlify swaps out RAG for a virtual filesystem that lets agents grep/ls/cat docs directly, cutting boot time to ~100ms and cost to zero. The pattern gives agents a predictable, low-latency interface to source-of-truth docs, a practical Map + Tech Island move for doc-driven agents (Principles 06, 07).
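
To make the grep/ls/cat interface concrete, here is a minimal sketch of an in-memory docs tree an agent could call as tools. It is not Mintlify's implementation: the class name `VirtualDocsFS`, its method signatures, and the sample files are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class VirtualDocsFS:
    """Hypothetical in-memory docs tree exposing ls/cat/grep to an agent."""
    files: dict[str, str] = field(default_factory=dict)

    def ls(self, prefix: str = "") -> list[str]:
        # Sorted paths under a directory-like prefix.
        return sorted(p for p in self.files if p.startswith(prefix))

    def cat(self, path: str) -> str:
        # Full text of one document; raises KeyError if it is missing.
        return self.files[path]

    def grep(self, pattern: str, prefix: str = "") -> list[tuple[str, int, str]]:
        # (path, 1-based line number, line) for every line matching the regex.
        rx = re.compile(pattern)
        hits = []
        for path in self.ls(prefix):
            for n, line in enumerate(self.files[path].splitlines(), start=1):
                if rx.search(line):
                    hits.append((path, n, line))
        return hits


# Sample content is invented for the example.
docs = VirtualDocsFS({
    "guides/quickstart.md": "# Quickstart\nInstall the CLI.\nRun `init`.",
    "guides/deploy.md": "# Deploy\nPush to main to deploy.",
    "api/auth.md": "# Auth\nSend the API key in a header.",
})
```

Because everything lives in a dictionary, there is no index to build and no retrieval service to call, which is the kind of property that makes startup cheap; a real system would still need to load and refresh the docs from somewhere.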

Karpathy shares ‘LLM Knowledge Base’ architecture that bypasses RAG with an evolving markdown library maintained by AI. Karpathy proposes an LLM-maintained Markdown knowledge base that compiles, lints, and links content to replace RAG for mid-sized datasets. This shifts context engineering toward writable, versioned artifacts you can audit and test, tightening documentation and validation workflows (Principles 13, 02).
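
One way to picture the "lints and links" part is a check that every internal link in the Markdown library resolves to an existing page. The sketch below is an assumption about what such a lint could look like, not Karpathy's tooling; `lint_links`, the flat page layout, and the sample pages are hypothetical.

```python
import re

# Matches [text](target.md) or [text](target.md#anchor) and captures target.md.
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def lint_links(pages: dict[str, str]) -> list[tuple[str, str]]:
    """Return (page, target) for each .md link whose target page is missing.

    Assumes a flat library where link targets are keys of `pages`.
    """
    broken = []
    for page in sorted(pages):
        for target in LINK.findall(pages[page]):
            if target not in pages:
                broken.append((page, target))
    return broken


# Invented sample library: index.md links to a page that does not exist yet.
kb = {
    "index.md": "See [agents](agents.md) and [evals](evals.md).",
    "agents.md": "Back to [index](index.md#top).",
}
broken_links = lint_links(kb)
```

A check like this can run in CI on every change the model makes to the library, which is what turns the knowledge base into an artifact that can be audited and tested.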

Async Python Is Secretly Deterministic. DBOS describes making async Python workflows replayable by deterministically assigning step IDs before awaits, enabling reliable checkpointed recovery. Determinism makes long-running, multi-agent workflows debuggable and fault-tolerant, a concrete step toward Order and durable orchestration (Principles 12, 14).
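
The core idea can be sketched in a few lines of asyncio: if each step takes its ID synchronously, before its first `await`, then concurrent steps receive the same IDs on every run regardless of which finishes first. This is an illustrative sketch and not DBOS's code; `Workflow`, `step`, `fetch`, and the in-memory checkpoint dictionary are hypothetical stand-ins for a durable store.

```python
import asyncio
import random


class Workflow:
    """Hypothetical workflow context with an in-memory checkpoint store."""

    def __init__(self, checkpoints=None):
        self.next_id = 0
        self.checkpoints = {} if checkpoints is None else checkpoints
        self.executed = []  # (step_id, name) for steps that actually ran

    async def step(self, name, fn, *args):
        # The ID is assigned before the first await, so it depends only on
        # the order in which steps are started, not on completion order.
        step_id = self.next_id
        self.next_id += 1
        if step_id in self.checkpoints:
            return self.checkpoints[step_id]  # replay: skip the work
        result = await fn(*args)
        self.checkpoints[step_id] = result
        self.executed.append((step_id, name))
        return result


async def fetch(label):
    # Random delay so completion order varies between runs.
    await asyncio.sleep(random.uniform(0, 0.01))
    return f"{label}-done"


async def run(wf):
    # gather starts the three steps in argument order.
    return await asyncio.gather(
        wf.step("a", fetch, "a"),
        wf.step("b", fetch, "b"),
        wf.step("c", fetch, "c"),
    )


first = Workflow()
results = asyncio.run(run(first))

# Simulate recovery: a fresh workflow replays from the saved checkpoints.
replay = Workflow(checkpoints=dict(first.checkpoints))
replayed = asyncio.run(run(replay))
```

On replay every step finds its checkpoint under the same ID and returns the stored result without re-running `fetch`, which is the behavior checkpointed recovery relies on.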

Vulnerability Research Is Cooked. Simon Willison and Thomas Ptacek argue coding agents rapidly accelerate exploit discovery and will reshape vulnerability economics. Outcome engineers must treat agent tooling as a dual-use vector: harden pipelines, lock down model capabilities, and build an immune system that audits and restricts agent-initiated code actions (Principle 14).