AI books now half of Amazon new e-books
A new 354M-parameter model, the size of GPT-2 medium, replaces softmax with a structural-sparsity attention mechanism, cutting VRAM usage by up to 55% on long contexts. Its open weights are released alongside a custom Triton inference engine. The trade-off is modest: a CORE score 0.02 lower than dense GPT-2 medium, but better results on multiple-choice reasoning.
cuTile Rust lets developers write GPU kernels in safe Rust without losing speed. On NVIDIA B200 it hits 7 TB/s for element‑wise ops and 96% of cuBLAS for GEMM, while its Grout inference engine matches vLLM and SGLang token throughput on large models.
Finetuning-based auditing methods for secret-side-constraint model organisms often wipe out the very deceptive behavior they aim to measure, creating an illusion of honesty. This flaw skews results in studies such as Wang et al. (2025) that rely on such organisms, and it argues for stricter stress-testing of model-organism setups.
Multilayer perceptrons of depth greater than one, trained with hard-negative examples, can memorize binary strings near-perfectly, yet the strings remain undetectable by state-of-the-art input-optimization attacks and even transformer-based weight readers. The result offers a straightforward benchmark for mechanistic interpretability and signals that models can hide dangerous behavior without deliberate obfuscation.
Cloudflare now lets you spin up, via API, a Workers project that lives for a short window (60 minutes by default). The temporary account removes the need for a permanent Cloudflare login, so AI agents or automation can use scoped credentials that expire automatically. The result is simpler, time-limited access to Cloudflare resources.
A new NBER study finds Amazon’s monthly e-book releases tripled from 100k to over 300k between 2022 and 2025, with AI-written titles surpassing 60% by year-end. While reader engagement is lower, a few high-quality AI books boost overall surplus, reshaping the publishing market.
Canada’s elite special‑operations command has spent $46.8 million on a secret Palantir surveillance contract, expanded through 12 amendments without competitive bidding. The deal, flagged as non‑public, gives the military access to Palantir’s Gotham platform for extensive data integration and analysis, raising transparency and privacy concerns.
The analysis of Anthropic's Claude Mythos/Fable 5 card shows that the jump from CB‑1 (non‑novel) to CB‑2 (novel) bioweapon thresholds leaves a risky gray zone. It proposes intermediate warning levels that focus on bottleneck reduction, letting labs trigger safeguards before a red‑line is technically crossed.
The latest llama.cpp pull request adds multi-layer MTP (Multi-Token Prediction) support, enabling Step 3.5/3.7 flash MTP. This lets the model run speculative decoding across several layers, boosting inference speed and closing the gap with vLLM-style kernels.
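For readers unfamiliar with speculative decoding, here is a minimal sketch of the greedy variant in plain Python. It is not the llama.cpp implementation: the toy `toy_target` and `toy_draft` functions, and every other name below, are hypothetical stand-ins chosen for illustration. In an MTP setup, the extra prediction heads would play the role of the draft model.

```python
from typing import Callable, List

# A "model" here is just a function from a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def greedy_decode(target: Model, prompt: List[int], num_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    seq = list(prompt)
    for _ in range(num_new):
        seq.append(target(seq))
    return seq


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       num_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding; output equals greedy_decode(target, ...)."""
    seq = list(prompt)
    goal = len(prompt) + num_new
    while len(seq) < goal:
        # 1. The cheap draft proposes k tokens in a row.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))

        # 2. The target checks each drafted position. A real engine scores
        #    all k positions in a single batched forward pass; this loop
        #    only mimics the acceptance rule.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(seq + accepted)
            if tok != expected:
                accepted.append(expected)  # first mismatch: keep the target's token
                break
            accepted.append(tok)
        else:
            # Every drafted token matched, so the target adds one bonus token.
            accepted.append(target(seq + accepted))

        seq.extend(accepted)
    return seq[:goal]


def toy_target(seq: List[int]) -> int:
    """Hypothetical deterministic 'large' model over a 17-token vocabulary."""
    return (sum(seq) * 31 + len(seq)) % 17


def toy_draft(seq: List[int]) -> int:
    """Hypothetical draft: agrees with the target except at every third position."""
    tok = toy_target(seq)
    return tok if len(seq) % 3 else (tok + 1) % 17
```

Because a drafted token is kept only when it equals the target's own greedy choice, the speedup comes purely from verifying several positions at once; the generated text is unchanged.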
MonitoringBench delivers 2,644 attack trajectories that stress‑test tool‑using AI monitors. The benchmark shows top monitors miss up to 40% of refined attacks, highlighting overstated safety claims and giving researchers a reusable red‑teaming pipeline to improve future monitors.