DeepSeek dropped V4 in two preview models today — V4-Pro (1.6T
total parameters, 49B active) and V4-Flash (284B, 13B active),
both 1M-context MoE under an MIT license. Pro is now the
largest open-weights model in circulation, larger than Kimi K2.6
and GLM-5.1, and its $1.74-per-million-input pricing undercuts
every Western frontier. Self-reported benchmarks trail GPT-5.4
and Gemini-3.1-Pro by three-to-six months — but at this price
gap, the spread barely matters for most production workloads.
GPT-5.5 lands with stronger coding, longer-horizon reasoning,
and noticeably better agentic behavior — rolling out to Plus,
Pro, Business and Enterprise this week. Framed as a step toward
the ChatGPT super app strategy.
CBS News reports Anthropic is investigating a possible breach
of its Mythos model via a third-party vendor environment inside
the Project Glasswing program. The first visible crack in the
restricted-access model since Glasswing launched.
The most thorough walkthrough yet of Opus 4.7's Adaptive
Thinking, the new effort levels, and the 1M-context workflow
changes. Covers when to reach for which effort level, how
Adaptive Thinking actually allocates compute, and ten concrete
workflows that are newly viable. The kind of practitioner read
you come back to twice while rewiring your agents.
A Streamlit benchmark harness that measures how Opus 4.7's
memory and effort levels trade off on real tasks. Practical
numbers for anyone tuning latency-vs-quality knobs in an app.
MIT Technology Review distills the macro picture of AI in
spring 2026 into a single chart pack — compute trends, model
spend, public opinion, and the gap between expert and lay
confidence. The clearest at-a-glance reference for the
trajectory of the field right now, and the one you'll keep
sending to non-AI colleagues asking what's going on.
A clear read on why Meta's Muse Spark launch represents a real
re-entry to frontier LLMs, not a rebrand of existing work.
Distribution across WhatsApp, Instagram, and Messenger is the
quiet moat.
The clearest practitioner teardown of DeepSeek V4-Pro and
V4-Flash that exists the morning they shipped — pricing
comparison table, quantization notes, MoE activation math, and
the pelican-on-a-bicycle test. The single pricing table alone
reframes every model-selection conversation you'll have next
week: Flash at $0.14 input beats GPT-5.4 Nano, Pro at $1.74
input beats every other frontier model outright.
Zvi's second Mythos deep-dive picks apart the Project
Glasswing mechanism — what responsible capability overhang
looks like in practice, the argument for and against vendor
gating, and the uncomfortable questions restraint leaves
unanswered. The governance companion to every other Mythos
piece this week.
A contrarian argument from a geopolitical-strategy shop:
Mistral is the structurally best-placed lab to benefit from
the coming compute surplus, because European sovereignty
buyers will pay a premium to not depend on US hyperscalers.
Bloomberg's deep-dive on why Anthropic restricted Claude Mythos to
Project Glasswing instead of a public release. During red-team
evaluation, the model autonomously identified and exploited a
previously unknown FreeBSD RCE vulnerability, crossing the company's
ASL-4 threshold for cyber-capability. The piece is the clearest public
picture yet of how frontier labs are handling capability overhang.
Claude Opus 4.7 launched at the same pricing as 4.6 with meaningful
gains on software engineering benchmarks, improved vision, and new
"effort" controls that let developers tune reasoning depth per call.
Ben Thompson argues Anthropic's decision to hold back Claude Mythos
resets the competitive frame: capability is table stakes, trust is
the moat. The piece threads Glasswing, the $800B valuation, and the
cyber-capability disclosure into a single strategic narrative about
where the frontier goes next. The essential read of the week.
MIT Tech Review unpacks the growing gap between expert enthusiasm
(56% excited) and public skepticism (10% excited) using Stanford's
2026 AI Index data. A clear-eyed look at why a technical story and
a social story have diverged.
Data-rich companion piece breaking down benchmark compression,
the narrowed US-China gap (now under 3%), and adoption curves
across industries. Useful visual reference for any AI briefing.