AI books now half of Amazon new e-books

AI · 2026-06-22

Models & Releases

Softmax‑free GPT‑2‑Medium model cuts VRAM, trades slight accuracy loss for efficiency · 3 MIN

A new 354M‑parameter, GPT‑2‑medium‑sized model replaces softmax with a structural‑sparsity attention mechanism, cutting VRAM usage by up to 55% on long contexts. Open weights and a custom Triton inference engine are released alongside it. The trade‑off is modest: its CORE score is 0.02 lower than that of dense GPT‑2 medium, but it does better on multiple‑choice reasoning.

Research

Safe Rust GPU kernels deliver near‑cuBLAS performance, rivaling vLLM · 1 MIN

cuTile Rust lets developers write GPU kernels in safe Rust without losing speed. On NVIDIA B200 it hits 7 TB/s for element‑wise ops and 96% of cuBLAS for GEMM, while its Grout inference engine matches vLLM and SGLang token throughput on large models.

Finetuning Audits Can Erase Deception Signals in Model Organisms · 23 MIN

Finetuning‑based auditing methods for secret‑side‑constraint model organisms often wipe out the very deceptive behavior they aim to measure, creating an illusion of honesty. This flaw skews results in studies such as Wang et al. (2025) that rely on such organisms, and it argues for stricter stress‑testing of model‑organism setups.

MLPs Can Hide Memorized Secrets From Both Human and AI Extraction · 28 MIN

Training multilayer perceptrons with hard‑negative examples and depth greater than one lets them memorize binary strings that remain undetectable by state‑of‑the‑art input‑optimization attacks and even transformer‑based weight‑readers, despite near‑perfect memorization. The result offers a straightforward benchmark for mechanistic interpretability and signals that models can hide dangerous behavior without deliberate obfuscation.

Products & Industry

Cloudflare adds 60‑minute temporary accounts for AI agents and automation · 1 MIN

Cloudflare now lets you spin up a Workers project via API that lives for a short window, 60 minutes by default. The temporary account removes the need for a permanent Cloudflare login, letting AI agents or automation use scoped credentials that auto‑expire. It simplifies secure, time‑limited access to Cloudflare resources.

AI‑Generated Books Now Make Up Over Half of Amazon’s New E‑Books · 3 MIN

A new NBER study finds Amazon’s monthly e‑book releases tripled from 100k to over 300k between 2022 and 2025, with AI‑written titles surpassing 60% by the end of 2025. While reader engagement with these titles is lower, a few high‑quality AI books boost overall surplus, reshaping the publishing market.

Policy & Safety

Canada spent $46.8M on secret Palantir deal for elite forces, hidden from Parliament · 3 MIN

Canada’s elite special‑operations command has spent $46.8 million on a secret Palantir surveillance contract, expanded through 12 amendments without competitive bidding. The deal, flagged as non‑public, gives the military access to Palantir’s Gotham platform for extensive data integration and analysis, raising transparency and privacy concerns.

Why AI labs need middle‑tier biorisk alerts before crossing red lines · 7 MIN

An analysis of Anthropic’s Claude Mythos/Fable 5 card shows that the jump from CB‑1 (non‑novel) to CB‑2 (novel) bioweapon thresholds leaves a risky gray zone. It proposes intermediate warning levels that focus on bottleneck reduction, letting labs trigger safeguards before a red line is technically crossed.

Tools & Open Source

llama.cpp adds multi‑layer flash MTP for Step 3.5/3.7 speculative decoding · 12 MIN

The latest llama.cpp pull request brings multi‑layer MTP (Multi‑Token Prediction) support, enabling Step 3.5/3.7 flash MTP. This lets the model run speculative decoding across several layers, boosting inference speed and closing the gap with vLLM‑style kernels.
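
For readers new to speculative decoding, here is a minimal sketch of the generic draft‑and‑verify loop. It is not llama.cpp's code and does not model MTP heads; the names speculative_decode, toy_target, toy_draft and greedy are hypothetical, and both "models" are toy functions standing in for a cheap drafter and the full model.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a context to its greedy next token


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new: int, k: int = 4) -> List[Token]:
    """Greedy draft-and-verify: output always matches target-only greedy decoding."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The cheap drafter proposes up to k tokens in a row.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target checks each proposed position. A real engine scores
        #    all positions in one batched forward pass; here it is a loop.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # Keep the agreed prefix, then take the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every drafted token was accepted.
            out.extend(proposal)
    return out


def greedy(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Plain one-token-at-a-time decoding, used as the reference."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def toy_target(ctx: List[Token]) -> Token:
    # Assumed stand-in for the full model.
    return (ctx[-1] * 3 + 1) % 7


def toy_draft(ctx: List[Token]) -> Token:
    # Assumed stand-in for the drafter: agrees with the target most of the time.
    if len(ctx) % 5 == 0:
        return (ctx[-1] + 1) % 7
    return toy_target(ctx)


if __name__ == "__main__":
    fast = speculative_decode(toy_target, toy_draft, [1], max_new=12)
    slow = greedy(toy_target, [1], max_new=12)
    print(fast == slow)
```

The speedup in practice comes from the target verifying several drafted tokens per pass instead of producing one token per pass; the sketch only shows that the accepted output is unchanged.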

MonitoringBench exposes gaps in AI agent monitor performance · 2 MIN

MonitoringBench delivers 2,644 attack trajectories that stress‑test tool‑using AI monitors. The benchmark shows top monitors miss up to 40% of refined attacks, highlighting overstated safety claims and giving researchers a reusable red‑teaming pipeline to improve future monitors.