Grok 4's "smartest AI" claim unravels, cost spiral threatens AI
xAI touts Grok 4 as the world’s smartest AI, citing record ARC‑AGI scores. A deep dive finds the model uneven: strong on benchmark‑like tasks but undercooked elsewhere, with headline‑grabbing numbers that are inflated. Investors may get a PR win, but developers should temper expectations.
Google retrofitted its frozen Gemini Nano v3 models with Multi-Token Prediction, letting the phone generate several tokens per forward pass instead of one. The change cuts latency and power use on Pixel 9/10, making on‑device features like notification summaries and proofread feel instant without extra memory‑hungry draft models.
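To make the latency claim concrete, here is a back-of-the-envelope sketch in Python. It assumes every forward pass yields a fixed number of usable tokens, which is a simplification. The reply length and tokens-per-pass values are hypothetical, not figures from Google.

```python
import math

def forward_passes(num_tokens, tokens_per_pass):
    """Forward passes needed if every pass yields `tokens_per_pass` tokens."""
    return math.ceil(num_tokens / tokens_per_pass)

reply_length = 60  # hypothetical summary length in tokens (assumption)
for k in (1, 2, 4):  # hypothetical tokens produced per forward pass
    print(k, forward_passes(reply_length, k))
```

Fewer passes means fewer full reads of the model weights, which is where most of the time and energy goes on a phone.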
A new paper traces the power‑law distributions seen in neural network weights, activations, and attention to the universality of the generalized central limit theorem, positioning heavy‑tailed spectra as a natural bridge between sparsity and Gaussianity. The authors argue this mechanism drives networks toward sparse, factored representations, offering a statistical‑physics lens on modern deep‑learning empirics.
NVIDIA CEO Jensen Huang announced that the U.S. government will grant the licenses the company needs to resume shipping its H20 data‑center GPUs to China. At the same briefing he introduced a new RTX PRO GPU built to meet Chinese compliance requirements for industrial AI workloads such as digital twins and logistics.
Dean Ball argues that massive training expenses for frontier models are recouped in just months, after which newer models undercut prices, creating an unsustainable race. He warns the current U.S. licensing regime lacks clear safety standards, leaving future releases in limbo and amplifying economic and security risks.
A tiny C program lets you run Qwen 3 4B (and smaller) on a plain CPU, no GPU, no Python, no libraries. It loads FP16 weights, does FP32 accumulation, includes its own tokenizer and generation loop, and compiles in a single command. With ~6 GB RAM it’s usable for short chats on Linux or macOS.
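The "FP16 weights, FP32 accumulation" idea can be illustrated with a few lines of Python. This is a minimal sketch, not the project's actual C code: the function names are hypothetical, and 32-bit arithmetic is emulated by rounding through the standard `struct` module.

```python
import struct

def to_fp16_bytes(values):
    """Pack Python floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def round_fp32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_fp16_weights(weight_bytes, activations):
    """Dot product of FP16-stored weights with activations, accumulated in FP32."""
    # Widen each 2-byte weight to a float only when it is needed for math.
    weights = struct.unpack(f"<{len(weight_bytes) // 2}e", weight_bytes)
    acc = 0.0
    for w, a in zip(weights, activations):
        product = round_fp32(w * round_fp32(a))
        acc = round_fp32(acc + product)  # running sum kept at FP32 precision
    return acc

weights = to_fp16_bytes([0.5, -1.25, 2.0])  # 3 weights stored in 6 bytes
result = dot_fp16_weights(weights, [2.0, 4.0, 0.5])
print(result)
```

Storing weights in 16 bits halves memory relative to 32-bit floats, while summing in 32 bits limits the rounding error that would build up over long dot products.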