DeepSeek dropped V4 in two preview models today — V4-Pro (1.6T
total parameters, 49B active) and V4-Flash (284B, 13B active),
both 1M-context MoE under an MIT license. Pro is now the
largest open-weights model in circulation, larger than Kimi K2.6
and GLM-5.1, and its $1.74-per-million-input pricing undercuts
every Western frontier. Self-reported benchmarks trail GPT-5.4
and Gemini-3.1-Pro by three-to-six months — but at this price
gap, the spread barely matters for most production workloads.
At Google Cloud Next, Google unveiled two new TPU generations —
8t for training and 8i for inference — and opened Gemini
Enterprise Agent Platform with Model Garden access to 200+
models. The clearest direct shot at Nvidia's AI-compute moat
Google has taken yet.
Anthropic locked in multiple gigawatts of next-generation TPU
compute from Google and Broadcom starting 2027, underpinning
its $30B-plus run-rate revenue plan. Notable because it anchors
Anthropic's capacity story without deepening the Nvidia
dependency everyone else is working through.
MIT Technology Review distills the macro picture of AI in
spring 2026 into a single chart pack — compute trends, model
spend, public opinion, and the gap between expert and lay
confidence. The clearest at-a-glance reference for the
trajectory of the field right now, and the one you'll keep
sending to non-AI colleagues asking what's going on.
The clearest practitioner teardown of DeepSeek V4-Pro and
V4-Flash that exists the morning they shipped — pricing
comparison table, quantization notes, MoE activation math, and
the pelican-on-a-bicycle test. The single pricing table alone
reframes every model-selection conversation you'll have next
week: Flash at $0.14 input beats GPT-5.4 Nano, Pro at $1.74
input beats every other frontier model outright.
A contrarian argument from a geopolitical-strategy shop:
Mistral is the structurally best-placed lab to benefit from
the coming compute surplus, because European sovereignty
buyers will pay a premium to not depend on US hyperscalers.
Anthropic announced a $50 billion commitment to US-based data centers
and compute capacity, its largest infrastructure play to date.
The investment anchors the company's long-term training roadmap and
positions it to serve a fast-growing enterprise footprint domestically.
Hard numbers on the agent-driven shift: deployments initiated by
coding agents grew 10x in six months and now account for roughly 30%
of Vercel's platform traffic. Includes the architectural tweaks the
team made to absorb it.
Broadcom agreed to produce future Google AI chip versions and expanded
its deal with Anthropic, giving the startup access to 3.5 gigawatts
of compute capacity using Google's AI processors.
After years of software dominance, capital is shifting toward robotics,
physical infrastructure, and hardware integration. The buildout mirrors
19th-century railroad investment—excess capacity unlocks unseen uses.
MIT researchers applied control theory to prune unnecessary complexity
from AI models during training, cutting compute costs without
sacrificing performance. Elegant mathematical approach to efficiency.
Cursor shipped Warp Decode, a kernel design that reorganizes MoE
inference for 1.8x higher throughput and improved numerical accuracy
on Nvidia Blackwell GPUs. Hidden gem for optimization fans.
Individual take on this week's infrastructure plays: Anthropic's
Glasswing cybersecurity initiative with 12 partners, and how the
AI stack is evolving from cloud-first to infrastructure-native.
OpenAI closed the largest private funding round in history — $122B
at an $852B valuation — with revenue reportedly hitting $2B a month
and 900M weekly active users. The capital is earmarked for compute
infrastructure as the company consolidates around a unified
product strategy following Sora's shutdown. The round signals that
investors still see massive upside in scaling raw compute.
NVIDIA poured $2B into Marvell to accelerate NVLink Fusion chip
interconnection technology, doubling down on the networking layer
that makes multi-GPU clusters work.