News
AI, at newsroom pace.
RESEARCH
BrowserBC Lifts Browser Agent Success to 81% Using Human Traces
RESEARCH
Language Model Explanations Track Behavior Shifts Automatically
RESEARCH
Simple Prompting Baselines Outperform Complex Supervision Methods
RESEARCH
OpenAI releases GeneBench-Pro; tests AI judgment on 129 multi-stage genomics problems; GPT-5.6 Sol reaches 31.5%
RESEARCH
Google Releases Zero-Shot Tabular Model but Hides Benchmark Data
RESEARCH
Vision-language models route knowledge through just 2.5% of network
RESEARCH
AI Agents Double Repository-Level Merge Friction
RESEARCH
Original-Language Context Recovers Accuracy Lost in Multilingual Cascades
RESEARCH
Zhipu GLM-5.2 closes gap on Claude Opus 4.8; open-weight coding enters frontier tier
RESEARCH
ChatGPT crosses 1 billion monthly active users, fastest consumer app milestone in history
RESEARCH
TRIAGE Cuts Agent Actions 14.8% While Raising Success Rates
RESEARCH
Researchers Close Gap Between AI Agents and Hand-Curated Skills
RESEARCH
New Training Technique Improves LLM Confidence Calibration by 63%
RESEARCH
Artificial neuron on silicon chip discovered; mimics brain efficiency, could slash AI power use
RESEARCH
DeepSeek V4 DSpark speculative decoding cuts inference latency by 85%, arrives on Together AI
RESEARCH
OpenAI launches GPT-5.6 Sol family with government-gated access; leads TerminalBench at 91.9%
RESEARCH
GLM-5.2 from Chinese startup Z.ai beats GPT-5.5 on coding at one-sixth the cost
RESEARCH
ENS Hits 10× Accuracy on Tough PDE Benchmarks Without Correction Loops