§ BEAT
Research
Researchers Close Gap Between AI Agents and Hand-Curated Skills
AI Agents Double Repository-Level Merge Friction
Open-Weight Pipeline Achieves 68% Accuracy Extracting Political Networks from News
OpenThoughts-Agent Dataset Hits 44.8% on Agentic Benchmarks
Moebius Model Reaches Browser via ONNX+WebGPU in Parallel Agent Session
Princeton Releases LOCUS, Machine-Readable Corpus of 9,239 US Local Ordinances
Single Dense Model Hosts Hundreds of Agent Personas as Lightweight Masks
Component Interaction, Not Quality, Determines Agent Performance
Agents-K1 Replaces RAG Text Chunks With Typed Scientific Knowledge Graphs
Tahoe Text-to-SQL System Cuts Compiler Feedback by 96%
EEVEE Surpasses Self-Improving Agents With 48% Margin on Multi-Domain Inference
Piper Compiler Eliminates Hand-Coding for Distributed Training
FASE Speeds Up Hallucination Detection 333×
SIGA Speeds Coding Agents on Scientific Simulators by 36×
Output Format Drives Faster Accuracy Loss Than Domain Shift in Multimodal LLMs
GPIC Open-Source Dataset Displaces ImageNet-1K as Standard Training Corpus
Omega-QVLA Cuts Robot Vision Model Memory by 71% Without Retraining
Production Hardware Tests Needed Before OFT Replaces LoRA at Scale
Schema.org Metadata Cuts Agentic Retrieval Errors by Two-Thirds
Meta Shrinks Mixture-of-Experts to Smartphones Without Cloud Offloading
IBM Framework Classifies Code Changes at 84% Recall
Five Bugs Killed agentmemory in Seven Days
Microsoft's SkillOpt Lifts Agent Accuracy 24 Points via Automated Skill Refinement
NVIDIA's CARV Cuts 3D Distillation Compute by 2–3×
Allen AI's OlmoEarth v1.1 Cuts Satellite Inference Compute 3×
Autonomous Disease Forecasting System Outperforms CDC Ensemble on Blinded Tests
Grep Beats Vector Search in Inline Agent Retrieval
TFlow Cuts Multi-Agent Inference Tokens 83% via Weight Injection
IBM Boosts Zero-Shot Search Accuracy 25% With LLM Query Refinement