Netflix kills Kafka, MCP reshapes BI, Sail 0.3 speeds Spark
Pinterest’s Big Data Platform team built Moka, a Spark‑on‑EKS service that runs batch Spark jobs in containers on AWS. By moving from Hadoop to Kubernetes, Moka promises better performance, lower ops costs, and easier scaling for non‑sensitive data workloads.
Netflix replaced the Kafka‑based pipeline behind Tudum with RAW Hollow, an in‑memory, compressed object store that keeps the full dataset in the memory of each service instance. The move cuts latency, simplifies the read path, and scales more predictably for a site with millions of monthly visitors.
Sail 0.3 replaces the Java Spark server with a Rust‑native implementation that speaks the Spark Connect protocol. It supports Spark 4.0 and 3.5, cuts object‑store latency, and ships a lightweight PySpark client, giving data teams faster, cheaper batch and streaming workloads without code changes.
Model Context Protocol (MCP) lets LLMs like Claude hook straight into PDFs, databases and BI tools via a single open spec. That eliminates custom connectors, letting AI fetch and visualize data on the fly, which could sideline many visualization and reporting roles.
Google Research unveiled frozen Multi-Token Prediction (MTP), a retrofit that speeds up Gemini Nano on Pixel phones without fine‑tuning separate draft models. By integrating a lightweight transformer into the frozen model, MTP cuts inference latency and power use, making on‑device AI features like notification summaries and proofreading snappier and more battery‑friendly.
A tiny C++ daemon multiplexes transformer layers and enforces admission control so that three distinct LLMs (SmolLM, Qwen2, and Llama) can share an 8 GB GTX 1080 without out‑of‑memory crashes. The trick lets developers run parallel agents on legacy GPUs and avoid costly upgrades.
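To make the admission-control idea concrete, here is a minimal sketch in Python rather than the daemon's C++. It is not the project's code: the class name, the per-request memory estimates, and the 7,000 MB budget are all hypothetical. It shows only the core rule, which is to admit a request only if its estimated footprint still fits in the remaining memory budget.

```python
from dataclasses import dataclass, field


@dataclass
class AdmissionController:
    """Admit a request only if its estimated GPU memory fits the remaining budget.

    Hypothetical illustration; names and numbers are assumptions.
    """
    budget_mb: int                               # total memory that may be reserved
    in_use: dict = field(default_factory=dict)   # request id -> reserved MB

    def used_mb(self) -> int:
        return sum(self.in_use.values())

    def try_admit(self, request_id: str, estimate_mb: int) -> bool:
        # Reject duplicates and anything that would exceed the budget.
        if request_id in self.in_use:
            return False
        if self.used_mb() + estimate_mb > self.budget_mb:
            return False
        self.in_use[request_id] = estimate_mb
        return True

    def release(self, request_id: str) -> None:
        # Free the reservation when the request finishes; unknown ids are ignored.
        self.in_use.pop(request_id, None)


controller = AdmissionController(budget_mb=7000)      # assumed headroom on an 8 GB card
admitted_a = controller.try_admit("smollm-1", 1500)   # fits
admitted_b = controller.try_admit("qwen2-1", 3000)    # fits
admitted_c = controller.try_admit("llama-1", 4000)    # would exceed the budget
controller.release("qwen2-1")
admitted_d = controller.try_admit("llama-1", 4000)    # fits after the release
```

A rejected request would typically be queued and retried once another request releases its reservation, which is what keeps the GPU from running out of memory under load.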
Binding processes to the correct NUMA nodes can double PyTorch throughput on multi-socket servers. The article shows how NUMA‑aware CPU‑GPU coordination reduces memory latency and boosts training speed, turning a common bottleneck into a performance lever for large‑scale deep learning workloads.
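As a rough illustration of NUMA binding, the sketch below pins the current process to the CPUs of one NUMA node on Linux. It is not taken from the article. It assumes the standard sysfs file `/sys/devices/system/node/node<N>/cpulist` is present, and the helper names `parse_cpulist` and `pin_to_numa_node` are made up for this example. It sets CPU affinity only. Memory placement is a separate policy, usually controlled with a tool such as `numactl`.

```python
import os
from pathlib import Path


def parse_cpulist(text: str) -> set[int]:
    """Parse a Linux cpulist string such as "0-3,8-11" into a set of CPU ids."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_numa_node(node: int) -> set[int]:
    """Restrict the calling process to the CPUs of one NUMA node (Linux only)."""
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    cpus = parse_cpulist(path.read_text())
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

In a multi-GPU training job, each worker would call `pin_to_numa_node` with the node closest to its GPU before creating data-loader workers, so that child processes inherit the affinity.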
RAG benchmarks often become accidental training sets when developers tweak models based on the same test queries. This hidden overfitting inflates scores but masks real retrieval failures, meaning deployed systems may miss critical information. The post shows how to break the cycle and keep evaluation truly unseen.
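One common way to keep part of a benchmark unseen, not necessarily the method the post describes, is to assign each query to a development or holdout split deterministically from a hash of its text. Developers tune only against the development split and score the holdout rarely. The sketch below assumes a 20% holdout fraction, and the function names are hypothetical.

```python
import hashlib


def split_of(query: str, holdout_fraction: float = 0.2) -> str:
    """Assign a query to 'dev' or 'holdout' deterministically from its text."""
    # Normalize case and whitespace so trivial rewrites land in the same split.
    normalized = " ".join(query.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
    return "holdout" if bucket < holdout_fraction else "dev"


def partition(queries, holdout_fraction: float = 0.2):
    """Split an iterable of queries into (dev, holdout) lists."""
    dev, holdout = [], []
    for q in queries:
        target = holdout if split_of(q, holdout_fraction) == "holdout" else dev
        target.append(q)
    return dev, holdout
```

Because the assignment depends only on the query text, a query never migrates between splits when the benchmark grows, so tuning on the development split cannot leak holdout queries.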
RudderStack proved PostgreSQL can sustain 100,000 events per second as a streaming queue, sidestepping Kafka. They tamed table bloat, rewrote indexing, and managed retry storms, turning a simple relational DB into a high‑throughput, resilient backbone. The playbook shows teams can leverage existing SQL skills to cut ops complexity while scaling massive event pipelines.
ClickHouse Cloud now runs compute without any local disks, using a new in‑memory engine backed by a Shared Catalog that centralizes metadata. This eliminates warm‑up, enables instant elastic scaling, and adds atomic DDL features like cross‑db renames and UNDROP, boosting reliability and speed for cloud workloads.