Infra

Infrastructure, GPU, compute, deployment

29 links across all digests

From Week 17, 2026

Bloomberg · 6 min read

DeepSeek V4 Pro and Flash arrive: open-weights frontier at a fraction of the price

DeepSeek released V4 today as two preview models: V4-Pro (1.6T total parameters, 49B active) and V4-Flash (284B total, 13B active), both 1M-context MoE models under an MIT license. Pro is now the largest open-weights model in circulation, larger than Kimi K2.6 and GLM-5.1, and its $1.74-per-million-input-token pricing undercuts every Western frontier model. Self-reported benchmarks trail GPT-5.4 and Gemini-3.1-Pro by three to six months, but at this price gap the spread barely matters for most production workloads.

Models OSS Infra
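For readers who want the MoE numbers from the DeepSeek entry above in concrete terms, here is a minimal sketch that computes what share of each model's parameters is active per token. The parameter counts come from the entry; the function and dictionary names are illustrative only, and the calculation assumes "active" means parameters used per token.

```python
# Parameter counts as quoted in the digest entry (total vs. active).
# All names here are illustrative, not from DeepSeek's documentation.
MODELS = {
    "V4-Pro": {"total_params": 1.6e12, "active_params": 49e9},
    "V4-Flash": {"total_params": 284e9, "active_params": 13e9},
}


def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters that participate in each token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


if __name__ == "__main__":
    for name, params in MODELS.items():
        share = active_fraction(**params)
        print(f"{name}: {share:.1%} of parameters active per token")
```

On these figures, both models activate well under 5% of their weights per token, which is part of why per-token serving cost can sit far below what the headline parameter count suggests.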

From Week 17, 2026

TechCrunch · 6 min read

Google Cloud launches TPU 8t and 8i, plus Gemini Enterprise Agent Platform

At Google Cloud Next, Google unveiled two new TPUs, the 8t for training and the 8i for inference, and opened Gemini Enterprise Agent Platform with Model Garden access to 200+ models. It is the most direct challenge Google has mounted yet to Nvidia's AI-compute moat.

Infra Hardware Agents

From Week 17, 2026

Anthropic · 4 min read

Anthropic expands Google and Broadcom partnership for multi-gigawatt TPU compute

Anthropic locked in multiple gigawatts of next-generation TPU compute from Google and Broadcom starting 2027, underpinning its $30B-plus run-rate revenue plan. Notable because it anchors Anthropic's capacity story without deepening the Nvidia dependency everyone else is working through.

Infra Funding

From Week 17, 2026

MIT Technology Review · 10 min read

The current state of AI, told through charts

MIT Technology Review distills the macro picture of AI in spring 2026 into a single chart pack — compute trends, model spend, public opinion, and the gap between expert and lay confidence. The clearest at-a-glance reference for the trajectory of the field right now, and the one you'll keep sending to non-AI colleagues asking what's going on.

Research Models Infra

From Week 17, 2026

Simon Willison · 6 min read

DeepSeek V4: almost on the frontier, a fraction of the price

The clearest practitioner teardown of DeepSeek V4-Pro and V4-Flash available the morning they shipped: pricing comparison table, quantization notes, MoE activation math, and the pelican-on-a-bicycle test. The pricing table alone reframes every model-selection conversation you'll have next week: Flash at $0.14 per million input tokens undercuts GPT-5.4 Nano, and Pro at $1.74 undercuts every other frontier model.

Models OSS Infra
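To see what the two input prices in the entry above mean for a budget, here is a rough sketch that estimates monthly input-token spend. The prices are the ones quoted in the entry and are assumed to be per million input tokens; the workload (2,000 input tokens per request, 50,000 requests a day) is entirely hypothetical, and output-token pricing is left out because the entry does not quote it.

```python
# Input prices in USD per million tokens, as quoted in the digest entry.
INPUT_PRICE_USD_PER_M = {"V4-Flash": 0.14, "V4-Pro": 1.74}


def monthly_input_cost(
    model: str,
    tokens_per_request: int,
    requests_per_day: int,
    days: int = 30,
) -> float:
    """Estimate monthly input-token spend in USD for a fixed workload."""
    total_tokens = tokens_per_request * requests_per_day * days
    return total_tokens / 1_000_000 * INPUT_PRICE_USD_PER_M[model]


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    for model in INPUT_PRICE_USD_PER_M:
        cost = monthly_input_cost(model, tokens_per_request=2_000, requests_per_day=50_000)
        print(f"{model}: ${cost:,.2f} per month (input tokens only)")
```

Under these assumptions Flash comes to about $420 a month and Pro to about $5,220, so the choice between them is roughly a 12x difference on input spend alone.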

From Week 17, 2026

Bismarck Analysis · 12 min read

AI 2026: Mistral will rise as compute is unleashed

A contrarian argument from a geopolitical-strategy shop: Mistral is the structurally best-placed lab to benefit from the coming compute surplus, because European sovereignty buyers will pay a premium to not depend on US hyperscalers.

Models Infra Funding

From Week 16, 2026

Anthropic · 5 min read

Anthropic commits $50 billion to American AI infrastructure

Anthropic announced a $50 billion commitment to US-based data centers and compute capacity, its largest infrastructure play to date. The investment anchors the company's long-term training roadmap and positions it to serve a fast-growing enterprise footprint domestically.

Infra Funding

From Week 16, 2026

Vercel · 7 min read

Agentic infrastructure: coding agents now drive 30% of Vercel deployments

Hard numbers on the agent-driven shift: deployments initiated by coding agents grew 10x in six months and now account for roughly 30% of deployments on Vercel's platform. Includes the architectural tweaks the team made to absorb it.

Agents Infra Dev
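The "10x in six months" figure in the Vercel entry above is easier to plan around as a monthly rate. Here is a small sketch that converts it, assuming steady compounding growth over the period (the entry does not say the growth was evenly distributed); the function names are illustrative.

```python
import math


def monthly_growth_factor(total_multiple: float, months: float) -> float:
    """Constant month-over-month factor that compounds to total_multiple."""
    return total_multiple ** (1 / months)


def doubling_time_months(total_multiple: float, months: float) -> float:
    """Months needed to double at the same constant compounding rate."""
    return months * math.log(2) / math.log(total_multiple)


if __name__ == "__main__":
    factor = monthly_growth_factor(10, 6)
    print(f"Implied growth: {factor - 1:.1%} per month")
    print(f"Implied doubling time: {doubling_time_months(10, 6):.2f} months")
```

Under that assumption, 10x in six months works out to a bit under 47% growth per month, or a doubling roughly every 1.8 months.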

From Week 15, 2026

CNBC · 5 min read

Broadcom expands chip deals with Google and Anthropic

Broadcom agreed to produce future versions of Google's AI chips and expanded its deal with Anthropic, giving the startup access to 3.5 gigawatts of compute capacity built on Google's AI processors.

Infra Funding

From Week 15, 2026

OptionsAI · 8 min read

AI's next investment phase: More physical than digital

After years of software dominance, capital is shifting toward robotics, physical infrastructure, and hardware integration. The buildout mirrors 19th-century railroad investment, where excess capacity unlocked unforeseen uses.

Funding Hardware Infra

From Week 15, 2026

The Neuron · 7 min read

MIT: Control theory simplifies AI models during training

MIT researchers applied control theory to prune unnecessary complexity from AI models during training, cutting compute costs without sacrificing performance. Elegant mathematical approach to efficiency.

Research Infra

From Week 15, 2026

The Neuron · 6 min read

Cursor's Warp Decode: MoE inference optimization on Blackwell

Cursor shipped Warp Decode, a kernel design that reorganizes MoE inference for 1.8x higher throughput and improved numerical accuracy on Nvidia Blackwell GPUs. Hidden gem for optimization fans.

Infra Tools
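As a quick way to read the 1.8x throughput figure in the Warp Decode entry above, here is a sketch that converts a throughput speedup into the share of GPU time saved for a fixed token volume. It assumes the speedup applies to the whole decode workload, which the entry does not claim; the function name is illustrative.

```python
def gpu_time_saved(speedup: float) -> float:
    """Fraction of GPU time saved for a fixed workload at a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1 - 1 / speedup


if __name__ == "__main__":
    print(f"1.8x throughput: {gpu_time_saved(1.8):.0%} less GPU time for the same tokens")
```

Under that assumption, 1.8x throughput means the same tokens take a little over half the GPU time, so the saving is about 44%, not 80%.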

From Week 15, 2026

KubeSimplify Diaries · 8 min read

KubeSimplify: April 8 infrastructure and platform developments

Individual take on this week's infrastructure plays: Anthropic's Glasswing cybersecurity initiative with 12 partners, and how the AI stack is evolving from cloud-first to infrastructure-native.

Infra Dev

From Week 14, 2026

OpenAI · 6 min read

OpenAI raises $122B at $852B valuation to fuel compute expansion

OpenAI closed the largest private funding round in history — $122B at an $852B valuation — with revenue reportedly hitting $2B a month and 900M weekly active users. The capital is earmarked for compute infrastructure as the company consolidates around a unified product strategy following Sora's shutdown. The round signals that investors still see massive upside in scaling raw compute.

Funding Infra

From Week 14, 2026

CNBC · 3 min read

NVIDIA invests $2B in Marvell for NVLink Fusion interconnects

NVIDIA poured $2B into Marvell to accelerate NVLink Fusion chip interconnection technology, doubling down on the networking layer that makes multi-GPU clusters work.

Hardware Infra