THU, JUL 02, 2026 · Issue Nº 72
aiexpert
Running the wire
Breaking Cloudflare opens Monetization Gateway for x402 stablecoin micropayments; agents pay per request without signup
Breaking Hugging Face + Cerebras unlock real-time voice AI for robots; Gemma 4 at 1,800 TPS enables low-latency speech-to-speech on 7.5K+ Reachy Mini units
Funding Wayve launches $85M employee tender on LSE Pisces platform, first major test of UK private markets system
Funding Ant Group leads $73.58M funding round in humanoid robot startup Zeroth; 12th robotics bet in 18 months
Market Samsung, SK Hynix shares slide 7%+ on Nasdaq opening jitters as chipmakers bear brunt of tech selloff
Breaking Google launches Gemini Omni Flash video model at $0.10/sec and Nano Banana 2 Lite image model into GA
Chips Tesla hires Gary Jiang, 17-year Intel veteran, as Director of Terafab chip project
Market Meta launches cloud business to sell excess AI compute capacity; stock +8%
Market NVIDIA projects $1 trillion AI infrastructure demand through 2027; doubles prior forecast
Chips Samsung HBM4 surpasses $1B in sales within 4 months; projects $10B full-year run rate
Funding Oxmiq Labs raises $35M Series A for licensable GPU IP, eyes Arm-like architecture
Research ChatGPT crosses 1 billion monthly active users, fastest consumer app milestone in history
Chips NVIDIA and TSMC mark first US-made Blackwell wafer in Phoenix, plan $500B infrastructure spend over 4 years
Funding Oxmiq raises $35M Series A for RISC-V GPU IP, expands data center architecture focus
Breaking Klarna's PriceRunner wins $1.97B antitrust verdict against Google in Swedish court
Policy Anthropic restores Claude Fable 5 globally as U.S. lifts export controls after safety fix
Market Emerging markets tech stocks lead H1 2026; US Big Tech up 19.4% vs emerging markets +90%
Chips Computex 2026: laptop market splits into budget 8GB mainstream and $5K+ agentic-compute tier
Policy EU Chips Act 2.0: Brussels launches overhaul targeting AI chip supply security
Chips US PC shipments fell 7% in Q1 as memory costs spike; budget segment down 18.7%
Breaking

Hugging Face + Cerebras unlock real-time voice AI for robots; Gemma 4 at 1,800 TPS enables low-latency speech-to-speech on 7.5K+ Reachy Mini units

Hugging Face and Cerebras published a modular speech-to-speech pipeline on July 1 that pairs Cerebras Inference (running Gemma 4 31B at 1,851 tokens/sec) with open-source audio components: NVIDIA Parakeet for speech recognition, Alibaba's Qwen3-TTS for speech synthesis, and Silero VAD for voice activity detection. The stack is deployed in production on Reachy Mini, Pollen Robotics' $300 desktop robot, which has more than 7,500 units in the field. Unlike earlier embodied AI approaches built on monolithic cloud APIs, the pipeline delivers real-time conversational interaction at latencies previously out of reach for hardware in this class. The language model stage still runs on Cerebras Inference rather than on the robot itself, so the stack is open and composable but not fully local.

Gemma 4 31B runs at 1,851 tokens/sec on Cerebras. It is the first multimodal model the company has brought to its wafer-scale hardware, and it is reportedly 18x faster than Claude Haiku at equivalent quality. At that speed, agentic loops with multiple tool calls and vision reasoning can complete in real time rather than after multi-second waits. Cerebras claims the lower latency unlocks new product experiences: screenshot-to-patch, dense document analysis, and conversational editing with tight human-in-the-loop feedback cycles.
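As a rough back-of-the-envelope illustration of why throughput matters for agentic loops, the sketch below converts tokens/sec into wall-clock decode time. The workload (5 steps of 200 output tokens each) is hypothetical, and the baseline rate is derived only from the article's 18x figure. It ignores network latency, prompt processing, and tool execution time, so it is an estimate of decode time alone, not a benchmark.

```python
CEREBRAS_TPS = 1851.0                   # tokens/sec quoted in the article
SPEEDUP = 18.0                          # "18x faster" figure quoted in the article
BASELINE_TPS = CEREBRAS_TPS / SPEEDUP   # implied baseline, roughly 103 tokens/sec


def generation_seconds(output_tokens: int, tokens_per_sec: float) -> float:
    """Decode time for output_tokens at a steady generation rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


def loop_seconds(steps: int, tokens_per_step: int, tokens_per_sec: float) -> float:
    """Total decode time for an agentic loop of identical steps."""
    return steps * generation_seconds(tokens_per_step, tokens_per_sec)


# Hypothetical workload: 5 tool-calling steps, 200 output tokens per step.
STEPS, TOKENS_PER_STEP = 5, 200
fast = loop_seconds(STEPS, TOKENS_PER_STEP, CEREBRAS_TPS)
slow = loop_seconds(STEPS, TOKENS_PER_STEP, BASELINE_TPS)
print(f"{fast:.2f}s vs {slow:.2f}s")    # prints "0.54s vs 9.72s"
```

Under these assumptions the same loop drops from roughly ten seconds of decoding to about half a second, which is the difference between a visible pause and a conversational response.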

The Reachy Mini deployment is a shipped product rather than a demo: 7,500+ units can now hold responsive voice conversations using open-source tooling. Hugging Face addressed the main bottleneck, Qwen3-TTS, with CUDA graphs and static KV caches, cutting time-to-first-audio from seconds to under 200 ms. Each component is modular and replaceable, so developers can swap the ASR, LLM, or TTS layer independently. The architecture reflects a shift away from monolithic cloud APIs toward composable, open inference stacks.
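The modular design described above can be sketched as a chain of four interchangeable callables (VAD, ASR, LLM, TTS). This is a minimal illustration, not the Hugging Face implementation: the `SpeechPipeline` class and all `stub_*` functions are hypothetical stand-ins, and a real stack would stream audio rather than pass whole byte strings.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional

# Each stage is a plain callable, so any one can be replaced independently.
VAD = Callable[[bytes], bool]   # audio chunk -> does it contain speech?
ASR = Callable[[bytes], str]    # audio chunk -> transcript
LLM = Callable[[str], str]      # transcript -> reply text
TTS = Callable[[str], bytes]    # reply text -> synthesized audio


@dataclass
class SpeechPipeline:
    """Hypothetical four-stage speech-to-speech pipeline."""
    vad: VAD
    asr: ASR
    llm: LLM
    tts: TTS

    def respond(self, audio: bytes) -> Optional[bytes]:
        """Return synthesized reply audio, or None if no speech is detected."""
        if not self.vad(audio):
            return None
        transcript = self.asr(audio)
        reply = self.llm(transcript)
        return self.tts(reply)


# Stub stages for illustration only; they treat "audio" as UTF-8 text.
def stub_vad(audio: bytes) -> bool:
    return len(audio) > 0

def stub_asr(audio: bytes) -> str:
    return audio.decode("utf-8")

def stub_llm(prompt: str) -> str:
    return f"You said: {prompt}"

def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechPipeline(stub_vad, stub_asr, stub_llm, stub_tts)
print(pipeline.respond(b"hello"))        # b'You said: hello'

# Swap only the LLM stage; the VAD, ASR, and TTS stages are untouched.
shouting = replace(pipeline, llm=lambda prompt: prompt.upper())
print(shouting.respond(b"hello"))        # b'HELLO'
```

Because the stages share only simple input and output types, replacing one (for example, a faster ASR model) requires no change to the others.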

For infrastructure builders, this signals that real-time embodied AI is now feasible on open-weight models without proprietary lock-in. Architects deploying voice-first robots or agents can benchmark Cerebras' Gemma 4 throughput against proprietary API vendors and local deployment alternatives. The modular stack also reduces operational risk: if any component gets faster (e.g., better ASR), the entire pipeline benefits. Watch whether Cerebras' wafer-scale hardware becomes the default inference layer for multi-turn agentic loops or remains a premium option for latency-critical applications.
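For readers who want to run the comparison suggested above, the following is one possible shape for a vendor-neutral harness. It measures time-to-first-token and overall tokens/sec from any iterable of streamed tokens. The names `measure_stream`, `StreamStats`, and `fake_stream` are hypothetical, and the simulated stream stands in for a real streaming client, which is assumed to yield one token (or chunk) at a time.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class StreamStats:
    tokens: int
    time_to_first_token_s: float
    tokens_per_sec: float


def measure_stream(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Consume a token stream and report latency and throughput."""
    start = clock()
    first_token_at = None
    count = 0
    for _ in stream:
        now = clock()
        if first_token_at is None:
            first_token_at = now   # timestamp of the first streamed token
        count += 1
    end = clock()
    if first_token_at is None:
        raise ValueError("stream produced no tokens")
    total = end - start
    tokens_per_sec = count / total if total > 0 else float("inf")
    return StreamStats(count, first_token_at - start, tokens_per_sec)


def fake_stream(n_tokens: int, delay_s: float) -> Iterator[str]:
    """Simulated backend: emits n_tokens with a fixed delay before each one."""
    for i in range(n_tokens):
        time.sleep(delay_s)
        yield f"tok{i}"


stats = measure_stream(fake_stream(20, 0.001))
print(stats)
```

Running the same function against each candidate backend's token stream gives comparable numbers, provided the prompts, output lengths, and network conditions are held constant.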

Sources