PostgreSQL 19 Beta Ships Autoscaling I/O

Data · 2026-06-08

Data Engineering

Keep Taxonomies and Ontologies Separate for Effective Knowledge Graphs · 8 MIN

Taxonomies capture human‑readable concepts, while ontologies define formal classes, properties, constraints, and inference rules. Vector search benefits from rich taxonomy text, whereas reasoning relies on ontology axioms. Linking the two lets you combine intuitive browsing with machine reasoning without conflating the layers.

Choosing Broker‑Visible vs Client‑Local Parallelism for Kafka Share Groups · 4 MIN

The post explains that parallelism in Kafka share groups can be managed either by the broker (using more partitions or consumers) or within the client (using async tasks, virtual threads, or internal queues). Broker‑visible parallelism scales with consumers and partitions, increasing broker state, while client‑local parallelism keeps fewer connections and offloads work to the client.
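
As a rough illustration of the client‑local option, the sketch below fans one fetched batch out to a thread pool inside a single consumer process. It makes no Kafka client calls: `process` and `handle_batch` are hypothetical names, and the `records` list stands in for whatever one consumer connection would have fetched. Acknowledging records back to the broker is deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor


def process(record: str) -> str:
    # Hypothetical per-record work; a real handler would parse,
    # enrich, or write the record somewhere.
    return record.upper()


def handle_batch(records: list[str], max_workers: int = 4) -> list[str]:
    """Process one fetched batch concurrently inside the client.

    Parallelism here is invisible to the broker: it still sees a
    single consumer, while the worker threads share the batch.
    Results come back in the same order as the input records.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process, records))
```

Raising `max_workers` increases throughput without adding consumers or partitions, which is the trade the summary describes: fewer connections and less broker state, more work on the client.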

Analytics & Visualization

Execs Slash Tableau Spending, Citing Cost Over Value · 6 MIN

Data leaders report a wave of Tableau removals driven by executive concerns over expense and the belief that BI adds little beyond dashboards. Migration proves disruptive and costly, prompting advice to safeguard essential BI‑only metrics and explore cheaper, consolidated analytics platforms.

ML & AI for Data

Temperature Scaling Simplifies LLM Calibration, but Platt and Isotonic Offer Trade‑offs · 7 MIN

The article explains three post‑hoc calibration methods: temperature scaling (simple, effective), Platt scaling (fast, data‑efficient but coarse), and isotonic regression (most flexible and accurate). It then turns to the challenges of adapting them to large language models, whose massive output spaces and limited token‑probability access complicate standard approaches. It also covers calibration metrics such as Expected Calibration Error and reliability diagrams.
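
For readers who want to see the simplest of the three methods in miniature, here is a sketch of temperature scaling and Expected Calibration Error for an ordinary classifier. It is not taken from the article: it assumes full access to per‑class logits (exactly what the summary says LLMs often lack), fits the temperature by a plain grid search over held‑out data rather than a gradient optimizer, and all function names are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    # Divide logits by the temperature, then normalize stably.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    # Average negative log-likelihood of the true labels.
    total = 0.0
    for row, y in zip(logit_rows, labels):
        total -= math.log(softmax(row, temperature)[y] + 1e-12)
    return total / len(labels)


def fit_temperature(logit_rows, labels, candidates=None):
    # Assumption: a coarse grid (0.1 .. 10.0) is good enough for a demo.
    if candidates is None:
        candidates = [0.1 * k for k in range(1, 101)]
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))


def expected_calibration_error(prob_rows, labels, n_bins=10):
    # Each bin holds [count, number correct, summed confidence].
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]
    for probs, y in zip(prob_rows, labels):
        conf = max(probs)
        pred = probs.index(conf)
        b = min(int(conf * n_bins), n_bins - 1)
        bins[b][0] += 1
        bins[b][1] += 1.0 if pred == y else 0.0
        bins[b][2] += conf
    n = len(labels)
    # Sum over bins of (bin size / n) * |accuracy - mean confidence|.
    return sum(abs(hits - conf_sum) / n for count, hits, conf_sum in bins if count)
```

A temperature above 1 softens overconfident predictions and one below 1 sharpens underconfident ones. Because every logit is divided by the same constant, the predicted class never changes, so accuracy is unaffected and only the confidence values move.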

Wes McKinney warns against ‘vibe coding’ and champions agentic engineering for safe AI data work · 14 MIN

Wes McKinney warns that “vibe coding”, relying on one‑shot prompts and shipping code without review, poses serious risks for data‑critical applications. He advocates “agentic engineering”, where humans rigorously define specs, test, and guide AI agents, keeping humans in the loop to ensure trustworthy outcomes.

Databases & Storage

PostgreSQL 19 Beta 1 released with autoscaling I/O and parallel autovacuum · 6 MIN

The PostgreSQL Global Development Group announced the first beta of PostgreSQL 19, featuring autoscaling asynchronous I/O workers, parallel autovacuum, and up to 2× faster foreign‑key inserts. New capabilities include SQL/PGQ graph queries, restart‑free logical replication, and enhanced observability for production‑grade testing.

Join-Aware Materialized Views Close the Rewrite Gap for Star-Schema BI · 5 MIN

The article explains how join‑aware materialized views retain fact‑to‑dimension joins, enabling automatic query rewrite for common star‑schema dashboards. By contrast, single‑table MVs lack the dimension attributes that dashboards group by, so they cannot accelerate such queries. The article also surveys support across major warehouses (StarRocks, BigQuery, Redshift, Oracle) and the optimizer complexity involved.

Practice & Datasets

Amazon proposes an 'audit‑then‑score' workflow to keep AI fact‑checking benchmarks dynamic · 4 MIN

Amazon’s AGI team shows that static benchmark labels fail for AI‑generated research reports. Their audit‑then‑score protocol lets models challenge expert labels, with human auditors reviewing disputes and updating ground truth. The approach is accompanied by two new datasets for evaluating deep‑research factuality.