Gemma 4 31B Matches Claude Sonnet 4.6

AI · 2026-06-08

Models & Releases

Gemma 4 31B FP8 Holds Its Own Against Claude Sonnet 4.6 in BenchLM Tests · 2 MIN

BenchLM’s head‑to‑head benchmark pits Google’s Gemma 4 31B (FP8) against Anthropic’s Claude Sonnet 4.6. Sonnet scores higher overall (83 vs. 64) and leads on coding, but Gemma narrows the gap on multimodal and knowledge tasks, and it comes with free token pricing and a 256K context window.

Research

Strategic Attack Timing Slashes AI Control Safety by Up to 28 Percentage Points · 1 MIN

The paper shows that attackers who selectively decide when to launch or abort attacks can reduce the measured safety of AI control frameworks by up to 28 percentage points, even with unchanged attack capabilities. The result exposes a blind spot in current evaluations, and the paper urges including attack selection to get realistic safety estimates.

SafeGene introduces reusable adapters to preserve LLM safety during fine‑tuning · 1 MIN

The paper presents SafeGene, a reusable safety‑adapter module that can be applied across tasks and model families to counteract safety degradation when open‑weight LLMs are fine‑tuned. By decoupling safety alignment from task updates via layer‑wise coefficient recalibration, SafeGene reduces harmful responses while maintaining downstream performance.

Lean4Agent introduces formal verification for LLM agent workflows using Lean4 · 1 MIN

Lean4Agent is the first framework that uses the dependent‑type language Lean4 to formally model and verify multi‑step LLM agent workflows and execution trajectories. Experiments on SWE‑Bench‑Verified and ELAIP‑Bench show verification‑passing workflows improve performance by ~12%, and the LeanEvolve component adds another ~7% gain.

AEGIS adds early-warning reflex to switch robot policies before failure · 2 MIN

AEGIS uses a lightweight probe on a weak robot policy’s activations to detect high‑risk steps and switches to a stronger policy only when needed. This selective escalation recovers about 10% of trajectories the weak policy loses, while the strong policy runs on just 38% of steps.

Policy & Safety

K‑pop fans rally against non‑consensual AI deepfakes of idols · 5 MIN

Fans of K‑pop are condemning the rise of AI‑generated deepfake videos that depict idols in non‑consensual, sexualized scenarios. The backlash on forums and Reddit highlights concerns about consent, privacy, and the potential for harmful exploitation of minors, prompting calls for tighter safeguards and platform enforcement.