Zoox presented Cortex, an internal AI gateway supporting multiple LLM providers and agentic workflows with dozens of tools. Staff Software Engineer Amit Navindgi introduced the system at QCon San Francisco in November 2025; by March 2026, the platform served more than 100 internal clients. The system operates inside an autonomous vehicle company with binding constraints: all data stays on-network (vehicle telemetry, rider PII, and internal source code remain inside the perimeter), latency stays acceptable for interactive applications, and integrations run deep into Zoox-specific services.

The architecture integrates RAG pipelines for knowledge retrieval, multi-modal LLMs ingesting text, images, video, and audio, and an agent API layer that internal teams use to wire Zoox-specific tools into model calls.
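Neither talk showed implementation code, so as a hedged sketch only: an agent API layer of the kind described typically exposes a tool registry that teams register handlers against, and the gateway advertises tool schemas to the model and dispatches the calls it proposes. All names below are hypothetical.

```python
# Hypothetical sketch of an agent-gateway tool registry; no names here
# come from Zoox's talks.
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class ToolSpec:
    """Schema a team registers so the gateway can advertise the tool to models."""
    name: str
    description: str
    handler: Callable[..., Any]
    parameters: dict = field(default_factory=dict)

class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        if spec.name in self._tools:
            raise ValueError(f"tool {spec.name!r} already registered")
        self._tools[spec.name] = spec

    def schemas(self) -> list[dict]:
        """What the gateway sends to the LLM provider as available tools."""
        return [
            {"name": t.name, "description": t.description, "parameters": t.parameters}
            for t in self._tools.values()
        ]

    def dispatch(self, name: str, args: dict) -> Any:
        """Execute a tool call the model proposed, staying on-network."""
        return self._tools[name].handler(**args)

# Example: an internal team wires a Zoox-specific lookup into model calls.
registry = ToolRegistry()
registry.register(ToolSpec(
    name="vehicle_status",
    description="Return a telemetry summary for a vehicle ID (invented example).",
    handler=lambda vehicle_id: {"vehicle_id": vehicle_id, "state": "nominal"},
    parameters={"vehicle_id": {"type": "string"}},
))
result = registry.dispatch("vehicle_status", {"vehicle_id": "zx-042"})
```

Centralizing registration this way is what lets a gateway enforce security boundaries at one choke point rather than in every client.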

FIG. 02 Cortex AI's four-layer architecture isolates all data on Zoox's internal network, from RAG retrieval through multi-modal LLMs to agentic routing. — Zoox Intelligence, QCon London March 2026

At the retrieval layer, RAG handles knowledge-base integration; fine-tuning is reserved for cases where a model must understand Zoox's autonomous driving behavior, something no document can teach. RAG answers "what does our system do and how" queries; fine-tuning answers "how does our vehicle drive" queries.
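The talks described this split without implementation detail. As a minimal sketch of the RAG side, using toy word-overlap scoring in place of a real embedding model and vector store, with all corpus content invented:

```python
# Minimal RAG retrieval sketch: rank internal documents against a query
# and build a grounded prompt. Production systems use embedding models
# and a vector store; word overlap just keeps this self-contained.
from collections import Counter

DOCS = {
    "deploy.md": "how to deploy a service to the internal cluster",
    "telemetry.md": "vehicle telemetry schema and retention policy",
    "oncall.md": "support triage steps for internal customer issues",
}

def score(query: str, doc: str) -> int:
    """Count words shared by query and document (toy relevance metric)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k best-matching document names."""
    ranked = sorted(DOCS, key=lambda name: score(query, DOCS[name]), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Ground the model's answer in retrieved internal content."""
    context = "\n".join(DOCS[name] for name in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

top = retrieve("how do I deploy a service")
```

The point of the split is visible even in the toy: retrieval can surface "what does our system do" content like the documents above, but no retrieved passage teaches a model driving behavior.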

Before Cortex, new engineers had to search Confluence, GitHub, Slack, and scattered PDFs to find how systems worked; getting new developers to ship meaningful code took a month or more, and a support issue from an internal customer could consume half a day because information was fragmented across channels. Cortex targets both problems: faster discovery during onboarding and agent-assisted support triage. Adoption spread through AI champions embedded in teams and internal hackathons, a deliberate organizational strategy rather than a pure technology rollout.

The gap is explicit: Navindgi disclosed no latency, cost-per-query, or throughput numbers. For architects modeling the operational cost at 100-plus internal clients, this omission matters. The platform began as a basic inference API wrapper, added RAG pipelines, and evolved into an agentic gateway. That progression (wrap first, add retrieval, then orchestrate agents) matches what many enterprise AI platform teams report.

The shift from deterministic, rule-based workflows to autonomous agents introduces failure modes that rule-based systems don't have. Navindgi named this as the most critical challenge, but neither talk detailed production failure modes—the most transferable data for anyone designing similar systems.

Cortex's architecture, with no off-the-shelf frameworks, on-network data, and routing, RAG, and agent tool registration all owned in-house, is a bet on retaining control of security boundaries and model-provider flexibility. The cost is that you build the orchestration layer yourself. If data gravity (PII, proprietary telemetry, regulated content) is the primary constraint, this design warrants examination before committing to an opinionated framework that assumes public API access.
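Nothing in the talks specified how Cortex's routing works. As a sketch of the kind of in-house routing layer this trade-off implies, with provider names and policy invented for illustration:

```python
# Hypothetical multi-provider router: the gateway owns the model-to-provider
# mapping, so swapping providers never leaks into client code.
from typing import Callable

# Each provider adapter is just a callable here; in production each would
# wrap an on-network endpoint for that provider.
Provider = Callable[[str, str], str]

def provider_a(model: str, prompt: str) -> str:
    return f"[provider-a/{model}] {prompt[:20]}"

def provider_b(model: str, prompt: str) -> str:
    return f"[provider-b/{model}] {prompt[:20]}"

ROUTES: dict[str, Provider] = {
    "fast-chat": provider_a,     # latency-sensitive interactive traffic
    "long-context": provider_b,  # large-document workloads
}

def complete(model: str, prompt: str) -> str:
    """Single client-facing entry point; clients never see provider details."""
    try:
        provider = ROUTES[model]
    except KeyError:
        raise ValueError(f"unknown model alias: {model!r}")
    return provider(model, prompt)

reply = complete("fast-chat", "Summarize the deploy runbook")
```

Owning this thin layer is the "build it yourself" cost named above; the payoff is that a provider change is a one-line edit to the routing table rather than a migration across every internal client.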

Written and edited by AI agents · Methodology