Claude models hallucinate parameters, break Pi contracts
shadcn/ui now ships new projects with Base UI as the default component library, moving away from its original Radix foundation. The switch is official but Radix remains supported, so existing apps can stay as‑is while new builds benefit from Base UI’s stability and growing ecosystem.
The authors introduce ActiveGraph, a runtime where an append‑only event log is the single source of truth for an agent. By projecting a deterministic graph from the log, agents can replay any run, branch at any point without re‑executing shared steps, and trace every output back to the exact model call.
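The replay-and-branch idea can be illustrated with a minimal sketch. This is not ActiveGraph's API: `Event`, `EventLog`, `branch`, and `project` are hypothetical names, and the "graph" here is just a chain with one node per event. It only assumes what the summary states, namely an append-only log and a deterministic projection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical labels, e.g. "model_call" or "tool_result"
    payload: str


class EventLog:
    """Append-only list of events; existing entries are never edited."""

    def __init__(self, events=()):
        self._events = list(events)

    def append(self, event):
        self._events.append(event)

    def events(self):
        return tuple(self._events)

    def branch(self, at):
        """Start a new log that reuses the first `at` events without re-running them."""
        return EventLog(self._events[:at])


def project(log):
    """Fold the log into a simple chain graph: one node per event, one edge per step."""
    events = log.events()
    nodes = dict(enumerate(events))
    edges = [(i - 1, i) for i in range(1, len(events))]
    return nodes, edges


main = EventLog()
main.append(Event("model_call", "plan"))
main.append(Event("tool_result", "tests passed"))
main.append(Event("model_call", "final answer A"))

# Branch after the second event: the shared prefix is copied, not re-executed.
alt = main.branch(at=2)
alt.append(Event("model_call", "final answer B"))
```

Because `project` depends only on the log contents, projecting the same log twice gives the same graph, and the two branches agree on every node before the branch point.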
Current AI unveiled an interactive Open Source AI Gap Map that compares open‑source models with closed‑source peers across modalities and benchmarks. The map highlights where community projects lag, guiding contributors toward the most critical capability gaps. It gives funders, developers, and researchers a clear view of where to invest effort.
A WebAssembly build of KiCad lets you design schematics and PCBs directly in the browser, no install required. The demo shows the full editor, from schematic capture to layout, proving that professional-grade electronics design can be done from any device.
Simon Willison used Claude Fable to generate the bulk of sqlite-utils 4.0rc2, spending roughly $149 on API calls. Over 37 prompts the AI fixed a critical transaction bug and made dozens of changes across 30 files, proving AI can drive a production‑ready open‑source release.
Anthropic’s latest Claude models (Opus 4.8, Sonnet 5) sometimes add made‑up fields to Pi’s edit tool calls, causing schema validation failures. The bug shows that bigger models can be less reliable for tool integration, warning developers that model upgrades aren’t always a net gain for automation pipelines.
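A strict check of this kind is easy to picture. The sketch below is an assumption-laden illustration, not Pi's actual schema or validator: the field names `path`, `old_text`, and `new_text` and the function `validate_tool_call` are hypothetical. It shows how a single invented field is enough to fail a call that is otherwise well-formed.

```python
# Hypothetical schema for an edit tool; Pi's real field names may differ.
EDIT_TOOL_FIELDS = {"path": str, "old_text": str, "new_text": str}


def validate_tool_call(args, schema=EDIT_TOOL_FIELDS):
    """Return a list of validation errors; an empty list means the call is accepted."""
    errors = []
    # Reject any field the schema does not declare (the failure mode described above).
    for name in sorted(set(args) - set(schema)):
        errors.append(f"unexpected field: {name}")
    # Check that every declared field is present and has the expected type.
    for name, expected in schema.items():
        if name not in args:
            errors.append(f"missing field: {name}")
        elif not isinstance(args[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    return errors


good_call = {"path": "app.py", "old_text": "x = 1", "new_text": "x = 2"}
# Same call plus a made-up field that the schema never defined.
bad_call = dict(good_call, replace_all=True)
```

Returning the error list to the model, rather than silently dropping the extra field, gives it a chance to retry with a valid call.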
OpenAI’s internal telemetry shows GPT‑5.5 Codex responses clustering at exactly 516 reasoning tokens, with spikes at 1034 and 1552. This fixed‑token pattern coincides with a sharp drop in average reasoning‑token usage and a measurable regression in complex refactoring and multi‑file code generation. The anomaly suggests a hidden budgeting or truncation mechanism affecting performance.
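The three reported values are evenly spaced, which a quick check makes explicit. The snippet uses only the numbers quoted above and says nothing about the underlying cause.

```python
# Reasoning-token counts where the responses are reported to cluster.
peaks = [516, 1034, 1552]

# Difference between each consecutive pair of peaks.
gaps = [later - earlier for earlier, later in zip(peaks, peaks[1:])]

# True when every gap is the same size, i.e. the peaks form a regular step pattern.
evenly_spaced = len(set(gaps)) == 1
```

Both gaps come out to 518 tokens, the kind of regular step one would expect if generation were being cut off at fixed intervals.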
Claude Fable can weigh two implementation options better than it can write code from scratch. By letting the model decide when to run tests or pick a lower‑power sub‑model, teams save tokens and keep costly reasoning in the main loop.
The 2003 Command & Conquer: Generals engine has been recompiled for ARM64, letting it run natively on Apple Silicon Macs, iPhone and iPad with touch‑friendly controls. No emulation is involved: DirectX 8 is translated via DXVK → Vulkan → MoltenVK → Metal, a chain that offers a practical roadmap for porting classic Windows games to Apple platforms.
The new DMARC np tag, meant to enforce policy on non‑existent subdomains, collides with DNSSEC's Compact Denial of Existence (RFC 9824). Compact denial answers queries for non‑existent names with NODATA rather than NXDOMAIN, so receivers cannot tell that a subdomain does not exist and ignore np when DNSSEC is enabled. This breaks the intended subdomain protection for domains hosted on Cloudflare, AWS, Azure, and other DNS providers, and the IETF has yet to resolve it.
Iwo Kadziela and Codex compressed an ASCII world map down to 445 bytes of data by stripping water dots, cropping margins, and exploiting repeated land characters for better deflate compression. The surrounding HTML stays under 1 KB, showing that extreme data compression can still yield a recognizable visualization. For tiny assets, the choice of representation matters more than raw detail.
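The preprocessing steps can be sketched on a toy map. This is not the author's code or data: `TOY_MAP`, `crop`, and `deflate` are hypothetical, and the sketch assumes that "stripping water dots" means turning them into blanks so that trailing water and empty rows can be dropped before compression.

```python
import zlib

# Toy stand-in map (not the real world map): '.' is water, '#' is land.
TOY_MAP = "\n".join([
    "....................",
    "....####......##....",
    "...######....####...",
    "....####......##....",
    "....................",
])


def crop(ascii_map, water="."):
    """Blank out water, then drop trailing blanks, empty edge rows, and the shared left margin."""
    rows = [row.replace(water, " ").rstrip() for row in ascii_map.split("\n")]
    while rows and not rows[0]:
        rows.pop(0)
    while rows and not rows[-1]:
        rows.pop()
    margin = min((len(r) - len(r.lstrip()) for r in rows if r), default=0)
    return "\n".join(row[margin:] for row in rows)


def deflate(text):
    """Raw deflate stream (negative wbits omits the zlib header and checksum)."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(text.encode("ascii")) + compressor.flush()


cropped = crop(TOY_MAP)
packed = deflate(cropped)
restored = zlib.decompress(packed, -15).decode("ascii")
```

Using a single character for all land leaves long identical runs, which is the kind of input deflate encodes most compactly.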