The week frontier AI got more expensive — and the alternatives got serious
OpenAI doubled the price of frontier intelligence; within 48 hours, open weights, mega-mergers, and LeCun's moonshot recapitalized the alternative stack.
Transcript
OpenAI doubled the price of GPT-5.5 — and launched the model without an API. Anthropic quietly tested a 5x price hike on Claude Code and reversed it within hours. In 48 hours, DeepSeek released open weights for a model claiming parity with the best closed systems, Alibaba compressed 807 gigabytes of model into 55 gigabytes with better benchmark scores, Google committed up to $40 billion to Anthropic, Cohere and Aleph Alpha merged into a $20 billion rival — and Yann LeCun left Meta with $1 billion to bet that the entire LLM playbook is wrong.
Every procurement assumption you built in Q1 is being renegotiated in real time.
It's April 25th. You're listening to ai|expert Wire. We start with pricing.
OpenAI launched GPT-5.5 on April 23rd inside Codex and for paying ChatGPT subscribers. API access: unavailable for an indefinite period. The company stated that API deployments require different safeguards and that it will bring GPT-5.5 and GPT-5.5 Pro to the API "soon" — with no concrete date.
When the API arrives, pricing will be $5 per million input tokens and $30 per million output — double the GPT-5.4 rates of $2.50 and $15 respectively. The Pro version goes to $30 input and $180 output. And reasoning scales sharply: in maximum-effort mode, Simon Willison measured 9,322 reasoning tokens on a single task, versus 39 tokens in standard mode — a 239x difference. At $30 per million output tokens, that shows up fast on any corporate spend dashboard.
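Taken at face value, the published rate and Willison's two token counts make the per-task cost gap easy to sketch. This is a back-of-envelope illustration using only the figures quoted above, not OpenAI's actual billing logic:

```python
# Sketch of the output-cost gap implied by Willison's measurements.
# Figures from the episode: $30 per million output tokens,
# 9,322 reasoning tokens (maximum effort) vs. 39 (standard mode).
PRICE_PER_OUTPUT_TOKEN = 30 / 1_000_000  # dollars per output token

def task_cost(output_tokens: int) -> float:
    """Dollar cost of one task's output at the quoted rate."""
    return output_tokens * PRICE_PER_OUTPUT_TOKEN

max_effort = task_cost(9_322)  # ~$0.28 per task
standard = task_cost(39)       # ~$0.0012 per task
print(f"max effort: ${max_effort:.4f}, standard: ${standard:.4f}, "
      f"ratio: {9_322 / 39:.0f}x")
```

Fractions of a cent per task in standard mode, but nearly thirty cents per task at maximum effort: multiplied across an agent fleet, that is the line item a spend dashboard flags first.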
OpenAI has offered an alternative route while the API remains unavailable. The /backend-api/codex/responses endpoint — the same one used by the open-source Codex CLI — was publicly endorsed for third-party integrations by Romain Huet, head of developer relations. The statement names JetBrains, Xcode, and even Claude Code as approved integrators. Any subscriber can route prompts to GPT-5.5 today.
No published SLA, no rate limits, no versioning. This is open-source infrastructure that OpenAI chose not to block — not a supported product. For production workloads, treat it as sandbox-level access until the formal API arrives.
On the infrastructure side, NVIDIA deployed GPT-5.5 via Codex to all of its 10,000-plus employees, running on GB200 NVL72 hardware with a zero data-retention policy. This generation delivers 35x lower cost per million tokens and 50x more tokens per second per megawatt versus previous systems. NVIDIA's IT provisioned a dedicated cloud virtual machine for each employee — an isolated, auditable sandbox with read-only access to production systems. Engineers report that debug cycles which previously took days now close in hours, and that week-long experiments complete overnight. Jensen Huang summarized in a company-wide email: "Let's leap to the speed of light."
While OpenAI raised prices with a public announcement, Anthropic tested a far larger hike — in silence. On April 22nd, the company updated the claude.com pricing page to restrict Claude Code to the $100-per-month Max plan — a 5x increase over the $20 Pro plan where the feature had lived. No blog post, no changelog, no email to existing subscribers.
The Internet Archive captured the page before the reversal. Amol Avasare, Anthropic's Head of Growth, was the only quasi-official voice, via tweet, describing the change as "a small test on around 2% of new subscribers." Simon Willison challenged it publicly: "I don't believe the '~2% of new subscribers,' because everyone I've spoken to is seeing the new pricing grid and the Internet Archive already has a copy."
Anthropic reversed within hours and had not issued a formal statement by the time this edition closed. Willison — who published 105 posts teaching Claude Code and ran a tutorial at the NICAR data journalism conference last month — put the strategic question plainly:
"Strategically, should I bet on Claude Code if I know they can multiply the product's minimum price by 5?"
Anthropic has not publicly committed to keeping Claude Code at the $20 tier. Any corporate build treating that entry point as stable is carrying undisclosed pricing risk.
The open-weight response came immediately. DeepSeek launched V4-Pro — 1.6 trillion total parameters, 49 billion active, mixture-of-experts architecture — and V4-Flash, with 284 billion total and 13 billion active. Both open-weight, both with API access available at launch.
DeepSeek claims that V4-Pro leads all open models in Math, STEM, and code, with parity against the best closed systems. On world knowledge, it trails only Gemini-3.1-Pro among all current models. A 1-million-token context window is now standard across all official services — a length most proprietary competitors gate behind a premium tier.
Necessary caveat: results are self-reported in the technical report released alongside the models. Independent verification does not yet exist — but the open weights mean the community is already running evaluations. Results should surface within days. Operational note in the meantime: deepseek-chat and deepseek-reasoner are officially deprecated with a hard sunset on July 24, 2026. Any integration hardcoded to those model strings has fewer than three months to migrate.
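For teams facing that migration, a minimal pre-sunset audit could look like the sketch below. The two deprecated identifiers come from the deprecation notice; the scanner itself, and its assumptions (Python sources only, plain-text matching), are our illustration, not a DeepSeek tool:

```python
# Illustrative sketch: flag hardcoded references to the deprecated
# DeepSeek model strings ahead of the July 24, 2026 sunset.
import re
from pathlib import Path

DEPRECATED = ("deepseek-chat", "deepseek-reasoner")

def find_deprecated_models(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line number, model string) for each occurrence."""
    pattern = re.compile("|".join(map(re.escape, DEPRECATED)))
    hits = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            match = pattern.search(line)
            if match:
                hits.append((str(path), lineno, match.group()))
    return hits
```

Running something like this against a codebase turns "fewer than three months to migrate" from a calendar note into a concrete work queue.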
In the same week, Alibaba published Qwen3.6-27B. The model scores 77.2% on SWE-bench Verified — surpassing its predecessor Qwen3.5-397B-A17B, which scored 76.2%. The predecessor weighs 807 gigabytes. Qwen3.6-27B weighs 55.6 gigabytes. The Q4_K_M quantized version fits in 16.8 gigabytes — a single consumer GPU. Willison measured 25.57 tokens per second running locally with llama.cpp.
A 14.5x reduction in file size between two consecutive open-weight coding flagships, with a better score on the leading agentic coding benchmark. The native context window reaches 262,144 tokens, extensible to 1 million. Teams evaluating multi-node infrastructure for code agents should run Qwen3.6-27B first.
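A quick back-of-envelope check of the quoted figures, with the 24-gigabyte consumer-card threshold as our own assumption about the target hardware:

```python
# Sanity-checking the Qwen3.6-27B numbers quoted above.
full_prev = 807.0  # GB, Qwen3.5-397B-A17B full weights
full_new = 55.6    # GB, Qwen3.6-27B full weights
quant = 16.8       # GB, Q4_K_M quantized version
tps = 25.57        # tokens/sec, Willison's local llama.cpp run

print(f"size reduction: {full_prev / full_new:.1f}x")          # ~14.5x
print(f"fits a 24 GB consumer GPU (assumed): {quant < 24}")
print(f"minutes to emit 10,000 tokens locally: {10_000 / tps / 60:.1f}")
```

Roughly six and a half minutes for 10,000 tokens on one local GPU: slow for interactive chat, entirely workable for an overnight batch of agentic coding runs.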
On the capital side, the week operated at a different scale. Google will invest up to $40 billion in Anthropic — $10 billion immediately, at a $350 billion valuation, with up to $30 billion more tied to performance milestones. The deal includes a new commitment of 5 gigawatts of Google Cloud capacity over five years — on top of a prior partnership with Broadcom that a securities filing put at 3.5 gigawatts.
The resulting structure has no direct precedent in the AI industry. Google competes with Anthropic at the model level via Gemini, supplies the TPUs that underpin Claude's inference, and now holds the largest individual financial stake in the company. Those three simultaneous roles — model rival, infrastructure supplier, and largest investor — give Google competitive visibility into Anthropic's technical roadmap and pricing leverage over its competitor's cost structure.
Amazon added $5 billion to its own stake in Anthropic this week, part of a broader deal under which Anthropic is expected to commit up to $100 billion for approximately 5 gigawatts of compute capacity over time. Anthropic also closed a separate datacenter capacity deal with CoreWeave. The company now carries multi-gigawatt commitments from two of the three largest hyperscalers simultaneously. Anthropic's valuation stood at $350 billion in February; investors have since expressed interest at $800 billion or more, according to Bloomberg.
When a single hyperscaler can sell you the chip, ship a model that competes with the one running on that chip, and hold equity in the rival startup, a multi-vendor strategy stops being a preference and becomes procurement policy.
The European counterweight arrived the same week. Toronto-based Cohere and Germany's Aleph Alpha announced a merger into a company valued at $20 billion, anchored by a $600 million Series E from Schwarz Group — Europe's largest retailer, operator of Lidl and Kaufland across 32 countries. The deal has not yet closed and is subject to regulatory review.
The thesis is straightforward: give companies and governments an alternative to dominant U.S. players, with greater control over their data. Aleph Alpha already serves a government AI assistant with 80,000 public-sector users. A document intelligence agent at a major chip manufacturer cut search times by 90%. Governance is built for the EU AI Act, not retrofitted afterward. When Europe's largest retailer writes a $600 million check to fund a sovereign AI alternative, that is an operational bet — not a portfolio hedge.
And then there's Yann LeCun. The former chief AI scientist at Meta, Turing Award winner for foundational work in deep learning, left the company late last year and founded Advanced Machine Intelligence Labs — AMI Labs. The organization raised $1 billion with 12 employees, built on the conviction that large language models cannot deliver on their promises.
The architecture LeCun is building has six interchangeable modules: a domain-specific world model, an actor that proposes next steps via classical reinforcement learning, a critic that scores options against hard-coded rules, a perception layer for video, audio, image, or text, a short-term memory, and a configurator that orchestrates data flow between all the others. Expert modules — which do not need to operate as generalists — are expected to require only a few hundred million parameters, versus the hundreds of billions of ChatGPT. That enables on-device inference, eliminating a cost and latency variable that makes LLM deployments increasingly difficult to justify at scale.
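AMI Labs has published no code, so the following is purely a conceptual sketch of how the six described modules could compose; every name and signature here is our illustration of the description above, not AMI Labs' design:

```python
# Conceptual sketch of the six-module layout LeCun describes.
# All names and interfaces are illustrative assumptions.
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Percept:
    modality: str  # "video", "audio", "image", or "text"
    data: Any

@dataclass
class ModularAgent:
    world_model: Callable[[Any, Any], Any]  # predicts outcome of an action in a domain
    actor: Callable[[Any], list[Any]]       # proposes candidate next steps (RL-trained)
    critic: Callable[[Any], float]          # scores outcomes against hard-coded rules
    perception: Callable[[Percept], Any]    # encodes raw input into a state
    memory: list[Any] = field(default_factory=list)  # short-term memory

    def step(self, percept: Percept) -> Any:
        """Configurator role: orchestrate data flow between the modules."""
        state = self.perception(percept)
        self.memory.append(state)
        candidates = self.actor(state)
        # Choose the action whose predicted outcome the critic scores highest.
        return max(candidates,
                   key=lambda a: self.critic(self.world_model(state, a)))
```

The point of the sketch is the size argument: each callable slot is a narrow expert for one domain, which is why the description claims a few hundred million parameters per module rather than the hundreds of billions of a generalist LLM.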
LeCun stated that AMI Labs should not produce a marketable product for perhaps five years. Narrow, modular AI has a proven track record where generalist approaches stumble: reinforcement-trained systems in specific, well-defined domains have consistently outperformed generalist models in those contexts. LeCun's argument is that the same logic scales to enterprise verticals.
A billion dollars for 12 people with a five-year horizon — that is either high conviction in LeCun's track record, or a hedge against the LLM scaling ceiling arriving sooner than expected. Either way, it is a material market signal: if the modular approach produces a benchmark-level result in a real domain, pressure on the LLM consensus stops being theoretical.
That's the Wire for this week. GPT-5.5 without an API, price doubled. Anthropic tested a 5x hike and walked it back without explanation. An open-weight model shrank from 807 gigabytes to 55 with a better benchmark score. Google became Anthropic's investor, supplier, and rival in a single deal. And LeCun bet a billion against the industry's core assumption. In Friday's edition, we go deeper: what Anthropic's all-Claude marketplace experiment reveals about agent-to-agent procurement, and how to rebuild your stack around this week's repricing. For reading ahead: our article on the Cohere–Aleph Alpha merger on the site — link in the episode notes. Until Friday.