The Inference Perimeter Closes: When Routing, Sandboxing and Signed Skills Become One Stack

01 · Context

Four announcements that all describe the same object

Read the week of May 24 to May 30, 2026 as a single object and four facts stop looking unrelated. On May 26, OpenRouter closed a USD 113M Series B led by CapitalG, with NVentures, ServiceNow Ventures, MongoDB Ventures, Snowflake Ventures and Databricks Ventures on the cap table — pricing the inference-gateway category at USD 1.3B on weekly traffic of 25T tokens routed across more than four hundred models.

One week earlier, at Code with Claude London on May 19, Anthropic published two enterprise features for Claude Managed Agents: self-hosted sandboxes in public beta, where tool execution moves out of Anthropic infrastructure and into customer-controlled runtimes such as Cloudflare, Daytona, Modal and Vercel, and MCP tunnels in research preview, which let agents reach private MCP servers without exposing them to the public internet.

Three days later, on May 22, NVIDIA released Verified Agent Skills: a SkillSpector vulnerability and prompt-injection scanner, cryptographic signing for every skill, and a machine-readable Skill Card that declares ownership, dependencies, limits and verification status. Capability governance moved from documentation into the wire format.

Then, on May 27, a federal judge in the Northern District of California issued a tentative ruling in Mobley v. Workday that Workday must defend AI hiring-discrimination claims under California's Fair Employment and Housing Act. Workday acknowledged in its own filings that 1.1B applications were screened through its tooling in the relevant window.

Inference is no longer an API call. It is a perimeter: routed, sandboxed, signed, and as of May 27, formally exposed to liability.

02 · Framework

The three-layer Inference Perimeter Stack

The reason these four announcements describe the same object is that they each instrument a different face of the same control surface. The frontier-model API call of 2024 was a one-line transaction: prompt out, completion back. The enterprise inference object of 2026 is a three-layer perimeter, and the announcements of the week each ship one of its layers into production-grade form.

Layer 1 — Routing

The routing valve decides which model serves which decision, on what substrate, at what cost, under which jurisdiction. OpenRouter's 25T weekly tokens, with hyperscaler-grade backing and a multi-provider failover surface, codify routing as a category — not a script. The routing decision is now a policy artifact: prompt-to-model is the new prompt-to-system.
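To make "routing as a policy artifact" concrete, here is a minimal sketch of a smallest-sufficient-model rule. It is an illustration under stated assumptions, not OpenRouter's API: the catalogue, model names, regions, capability scores and prices are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    region: str          # jurisdiction where inference runs
    capability: int      # coarse capability score, higher is stronger
    cost_per_1k: float   # illustrative price, not a real quote


@dataclass(frozen=True)
class Request:
    min_capability: int
    allowed_regions: frozenset


# Hypothetical catalogue; every name and number is a placeholder.
CATALOGUE = [
    Model("regional-small", "ar", 2, 0.10),
    Model("regional-mid", "br", 3, 0.40),
    Model("frontier-large", "us", 5, 2.00),
]


def route(req: Request, catalogue=CATALOGUE) -> Optional[Model]:
    """Return the cheapest model that satisfies capability and jurisdiction."""
    eligible = [
        m for m in catalogue
        if m.capability >= req.min_capability and m.region in req.allowed_regions
    ]
    return min(eligible, key=lambda m: m.cost_per_1k) if eligible else None
```

A request that needs capability 3 and may only run in "ar" or "br" resolves to regional-mid; a request that needs capability 5 but is confined to "ar" returns None, which a real gateway would surface as a policy violation rather than silently falling back.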

Layer 2 — Execution

The execution boundary determines where tool calls actually run, where private data sits during inference, and which network policies bind the agent's reach. Anthropic's self-hosted sandboxes and MCP tunnels move that boundary back inside the enterprise. Microsoft's computer-use agents in Copilot Studio, generally available since May 13 with Azure Key Vault credentials and Purview audit logging, do the same for desktop and browser automation.
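One way to picture a network policy that binds an agent's reach is an egress allowlist checked before every tool call. The sketch below is a simplified assumption of how such a check might look inside a customer-controlled runtime; the host names are hypothetical and it does not reflect the configuration format of any vendor named above.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of hosts the sandboxed agent may reach.
ALLOWED_HOSTS = {"mcp.internal.example", "ledger.internal.example"}


def egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS calls to explicitly listed hosts."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

Denying by default and listing hosts explicitly keeps the boundary auditable: the allowlist itself becomes part of the perimeter policy.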

Layer 3 — Provenance

The provenance layer answers a different question: what is this agent allowed to do, who signed for it, and what changed since last scan? NVIDIA's Verified Agent Skills bind a cryptographic signature, a SkillSpector report and a Skill Card manifest to every reusable capability. Camunda's ProcessOS, announced on May 20 in Amsterdam, frames the same point at the workflow level — a process becomes a governed, discoverable artifact rather than a hand-coded BPMN file.
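The idea of binding a signature to a capability manifest can be sketched in a few lines. This is not NVIDIA's Skill Card schema or signing scheme: the field names are hypothetical, and HMAC with a shared key stands in for the asymmetric signatures a production registry would more plausibly use.

```python
import hashlib
import hmac
import json


def canonical(card: dict) -> bytes:
    """Serialize the card deterministically so the signature is stable."""
    return json.dumps(card, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(card: dict, key: bytes) -> str:
    """Sign the canonical form of the card (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, canonical(card), hashlib.sha256).hexdigest()


def verify(card: dict, signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(card, key), signature)


# Hypothetical manifest covering ownership, dependencies, limits and scan status.
card = {
    "skill": "dock-scheduling",
    "owner": "logistics-platform-team",
    "dependencies": ["wms-terminal"],
    "limits": {"max_actions_per_run": 20},
    "scan_status": "passed",
}
key = b"demo-key-not-for-production"
sig = sign(card, key)

# Any change to the declared limits invalidates the signature.
tampered = {**card, "limits": {"max_actions_per_run": 2000}}
```

Here verify(card, sig, key) succeeds while verify(tampered, sig, key) fails, which is the property that lets a registry answer "what changed since last scan" mechanically.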

So what: the model is no longer the moat. The perimeter around the model is. Procurement should now ask three questions in this order — where does the routing decision live, where does the tool call execute, and who signs the skill — and any vendor unwilling to answer all three should be priced as a higher-risk counterparty.

03 · Use Cases

Three LATAM patterns that close the perimeter

CABA fintech BNPL: perimeter underwriting under Ley 25.326. A Buenos Aires consumer-finance platform deploys Anthropic self-hosted sandboxes inside its own VPC for credit-decision agents, with an OpenRouter-class gateway in front routing 62% of traffic to a regional smallest-sufficient model and 38% to a frontier model for edge cases. Cost-per-decision falls 44% while in-jurisdiction data residency rises to 100%; Tier-2 advisory human-in-the-loop review holds the override rate at 5.6% and pass^k (the probability that an agent completes a task correctly on all k of k independent attempts) at k=5 above 0.92. The decision ledger and signed skill cards make every approval auditable under EU AI Act Article 14 and Ley 25.326 Article 11.
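The arithmetic behind a 62/38 split can be checked with a short sketch. It rests on an assumption the text does not state: that the baseline sent 100% of traffic to the frontier model. Under that assumption, the function below solves for the small-to-frontier cost ratio that a 44% reduction implies.

```python
def blended_cost(small_share: float, small_cost: float, frontier_cost: float) -> float:
    """Traffic-weighted cost per decision for a two-tier split."""
    return small_share * small_cost + (1.0 - small_share) * frontier_cost


def required_cost_ratio(small_share: float, reduction: float) -> float:
    """Small-to-frontier cost ratio needed to reach a given reduction.

    Assumes the baseline routed all traffic to the frontier model, so
    small_share * ratio + (1 - small_share) = 1 - reduction.
    """
    return 1.0 - reduction / small_share


ratio = required_cost_ratio(small_share=0.62, reduction=0.44)
```

With these inputs the ratio comes out near 0.29: the regional model would need to cost roughly 29% of the frontier model per decision for the split to deliver the stated saving. A different baseline mix would change that figure.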

São Paulo industrial logistics: verified-skill gateway in front of computer-use agents. A multi-warehouse operator wraps Microsoft Copilot Studio computer-use agents around legacy WMS and TMS terminals that never exposed APIs. Every action is gated by NVIDIA-class signed skill cards, registered in an internal capability registry, and logged through Microsoft Purview. Throughput on inbound dock scheduling rises 31%, mean time to resolve exception cases drops from 42 min to 9 min, and forecast error on inbound volumes falls 26%. Vendor concentration sits below 60% because the same skill cards run against an alternative model whenever the gateway routes around the primary one.

Multi-country LATAM bank — three-tier perimeter with sovereign fallback. A regional banking group routes non-regulated customer-experience workloads through a frontier substrate, regulated credit and AML workloads through a sovereign substrate hosted at the CENIA Tarapacá facility using Latam-GPT, and on-prem document workloads through a small-language-model tier behind MCP tunnels. Portfolio cost falls 39%, sovereign-substrate coverage reaches 34% in regulated workloads, identity-attested action ratio exceeds 99%, and decision auditability is held at 100% across EU AI Act Article 14, LGPD Article 20 and Ley 25.326 Article 11 mappings.

04 · Implementation

Implementation: govern the perimeter as a contract, not a committee

The first implication of the Mobley v. Workday tentative ruling, and of Gartner's May 26 warning that uniform agent governance will produce enterprise agent failure, is that the perimeter must be expressible as machine-readable policy — not as a slide deck or a steering-committee charter. The routing rules, the sandbox boundaries and the signed skill manifests are the policy.

The second implication is that AI-vendor liability is no longer a contract negotiation in isolation. It is a function of how much of the perimeter the vendor controls. The more layers a single counterparty owns end-to-end, the more concentrated the legal exposure.

So what: KPIs before APIs. Measure the perimeter as carefully as the model. Coverage, substitution latency and skill-provenance ratio are the new board-level metrics — not benchmark scores.

Governance

Risk-tier the perimeter by reversibility and blast radius. Use three tiers, Tier 1 autonomous-with-logging, Tier 2 advisory human-in-the-loop (HITL) and Tier 3 approval-required, each routed by the same gateway that selects the model. Map each tier to EU AI Act Article 14, LGPD Article 20 and Ley 25.326 Article 11 explicitly. Treat the Mobley v. Workday tentative ruling as a strong signal of the legal weight of the perimeter, regardless of jurisdiction, even though a tentative district-court ruling is not binding precedent.
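Tiering by reversibility and blast radius can be expressed as a small decision function. The sketch below is one possible encoding; the blast-radius unit (people or accounts affected) and the threshold of 100 are hypothetical placeholders that each organisation would set for itself.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS_WITH_LOGGING = 1
    ADVISORY_HITL = 2
    APPROVAL_REQUIRED = 3


def risk_tier(reversible: bool, blast_radius: int, threshold: int = 100) -> Tier:
    """Map an agent action to a governance tier.

    Irreversible actions always require approval; reversible actions
    escalate to advisory review once they affect more than `threshold`
    people or accounts (a placeholder value).
    """
    if not reversible:
        return Tier.APPROVAL_REQUIRED
    if blast_radius > threshold:
        return Tier.ADVISORY_HITL
    return Tier.AUTONOMOUS_WITH_LOGGING
```

Because the function is pure and deterministic, the gateway can call it on every request and log the resulting tier next to the routing decision.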

KPIs

Perimeter coverage ratio above 90% of production agent traffic. Vendor concentration below 60%. Substitution latency under 30 days. Signed-skill ratio at 100% in regulated workloads. Cost-per-decision delta of at least 35% versus baseline. Override rate under 8%. Decision auditability at 100%. Sovereign-substrate coverage of at least 30% on regulated flows.
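The thresholds above lend themselves to a machine-checkable scorecard. The following sketch encodes them as data and reports which KPIs miss their target; the metric names are hypothetical labels, and the observed values are illustrative only, not measurements.

```python
import operator

# (metric, comparison, threshold) transcribed from the KPI list above.
TARGETS = [
    ("perimeter_coverage", operator.gt, 0.90),
    ("vendor_concentration", operator.lt, 0.60),
    ("substitution_latency_days", operator.lt, 30),
    ("signed_skill_ratio_regulated", operator.ge, 1.00),
    ("cost_per_decision_reduction", operator.ge, 0.35),
    ("override_rate", operator.lt, 0.08),
    ("decision_auditability", operator.ge, 1.00),
    ("sovereign_coverage_regulated", operator.ge, 0.30),
]


def failing_kpis(observed: dict) -> list:
    """Return the names of KPIs that are missing or miss their target."""
    return [
        name for name, compare, threshold in TARGETS
        if name not in observed or not compare(observed[name], threshold)
    ]


# Illustrative readings only.
observed = {
    "perimeter_coverage": 0.93,
    "vendor_concentration": 0.58,
    "substitution_latency_days": 45,
    "signed_skill_ratio_regulated": 1.00,
    "cost_per_decision_reduction": 0.39,
    "override_rate": 0.056,
    "decision_auditability": 1.00,
    "sovereign_coverage_regulated": 0.34,
}
print(failing_kpis(observed))  # → ['substitution_latency_days']
```

Treating a missing metric as a failure is a deliberate choice in this sketch: an unmeasured KPI should not pass a board review by default.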

12-month roadmap

Days 0–90: inventory every inference path, tag by data residency and risk tier, audit signed-skill coverage, baseline cost-per-decision and substitution latency.

Days 90–180: deploy a routing gateway with policy-as-code, sandbox top-three high-risk workloads, integrate Skill Cards into the registry, run the first dual-substrate failover drill.

Days 180–360: reach ≥60% smallest-sufficient-model coverage, light a sovereign Latam-GPT fallback for regulated flows, report perimeter KPIs quarterly to the board alongside concentration and override metrics.

Socradata Perspective

Buy the model. Own the perimeter. Sign the skill.

The dominant signal of the week is not that another foundation model got cheaper, faster or more capable. It is that the inference layer acquired a shape. Four announcements that would have read as unrelated in 2025 (a routing gateway raising growth capital, a frontier lab moving execution inside customer firewalls, a silicon vendor signing capability cards, and a federal court tentatively ruling that an HR-tech provider must defend AI hiring-discrimination claims) describe in 2026 the same object viewed from four sides. The object is the perimeter, and it is now the unit of enterprise AI procurement, governance and risk.

For LATAM operators, the perimeter conversation is also a sovereignty conversation. The same architectural primitives that let a São Paulo industrial logistics network keep its inference inside LGPD residency, a CABA fintech satisfy Ley 25.326 on every credit decision, and a multi-country bank route regulated workloads to the CENIA Tarapacá supercomputer running Latam-GPT, are the primitives that let any operator anywhere absorb the next vendor sunset, the next regulatory shift, the next class-action suit. Interoperability or it doesn't scale. From pilot to policy. The board metric that matters by the third quarter of 2026 is not adoption — it is perimeter coverage.

Diagnose the perimeter before the next ruling

Socradata runs structured Inference Perimeter Diagnostics for LATAM enterprises in financial services, industrial logistics and the public sector. The output is a perimeter map, a routing policy, a signed-skill plan and a quarterly KPI scorecard ready for board review under EU AI Act, LGPD and Ley 25.326.
