THU, JUL 02, 2026
Issue Nº 72
aiexpert
§ BEAT

Research

30 stories · Alignment & safety

TRIAGE Cuts Agent Actions 14.8% While Raising Success Rates

New Training Technique Improves LLM Confidence Calibration by 63%

Mechanism Taxonomy Lifts LLM Moderation F1 by 5.4%

DeepMind Forensic Protocol Diagnoses Confused vs. Misaligned AI

Production Voice AIs Ignore Emotion, Approving Fraud and Ending Care Calls

ClinHallu Dissects Why Medical LLMs Misread Images 65% of the Time

Sub-$11 Agent Outperforms Specialized Research Frameworks

Recursive Agent Harness Achieves 89% Accuracy on Long-Context Code Tasks

DIRECT cuts embodied AI latency 65% with dynamic planner routing

Token-Level Branching Offers Faster LLM Agent Training Without Budget Expansion

ABC-Bench Shows LLM Agents Now Outperform Expert Biologists on Lab Tasks

FPCG steers reasoning models at test time without retraining

Linear Probes Predict Reasoning-Model Behavior at 64–91% Accuracy

New DRPO Method Fixes Long-Tail Vocabulary Collapse in LLM RL

Router Matching 50 Retries with 10 Samples Cuts LLM Test-Time Compute

SafeSteer cuts alignment tax by targeting sparse safety tokens

Claude Code Spent 58% of Sessions Optimizing a Broken Architecture

RLHF Training Amplifies Model Bias to 100 Percent

MemAudit Cuts Memory-Poisoning Attacks to 0%

Rensselaer and IBM Expose KV Cache Leakage in Multi-Agent LLMs

Matching Principle Unifies Seven Robustness Families

Self-Modifying Agents Boost Benchmark Score to 0.61

LCGuard Patches KV-Cache Leakage in Multi-Agent Systems

Fine-tuning erases reasoning chains while accuracy stays high

Medical LLMs Underweight Patient Autonomy

Microsoft Finds GPT-5 Fails Against Implausible Attacks

LLM Formalization Catches 18.8% Ambiguous Requirements in Safety Specs

Negation Neglect Drives False Belief Rate to 88.6% in Fine-Tuned LLMs

Reward Hacking Undetected in Single-Verifier Training

Google's RubricEM trains research agents without ground truth