Together AI raises $305M Series B as DeepSeek reasoning demand drives infrastructure spending
Together AI announced a $305 million Series B funding round led by General Catalyst and co-led by Prosperity7, as the open-source inference platform capitalizes on surging demand for reasoning models like DeepSeek-R1. The round reflects structural demand from enterprises building with open models. These enterprises need dedicated infrastructure to handle long-running reasoning inference workloads, where a single request can take 2-3 minutes.
Contrary to early fears that advanced reasoning would reduce infrastructure demand, Together AI's growth shows the opposite: open-source reasoning models are consuming more compute, not less. DeepSeek-R1 is expensive to serve: at 671 billion parameters it must be distributed across multiple servers, and its higher quality draws more demand at the premium end, which in turn requires more overall capacity. Together AI has expanded its infrastructure to meet this demand, and the company says it serves DeepSeek-R1 at 85 tokens per second versus 7 tokens per second on Azure, a roughly 12x performance gap.
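To make the throughput figures concrete, the arithmetic behind the 12x claim can be sketched in a few lines of Python. The two tokens-per-second values come from the quoted source; the 10,000-token response length is a hypothetical chosen only for illustration, and the calculation ignores queueing and time to first token.

```python
# Rough sketch: compare decode throughput and wall-clock generation time.
# The throughput figures are the ones quoted in the article; RESPONSE_TOKENS
# is a hypothetical response length, not a number from the source.

TOGETHER_TPS = 85.0      # tokens per second, as quoted for Together AI
AZURE_TPS = 7.0          # tokens per second, as quoted for Azure
RESPONSE_TOKENS = 10_000  # assumed length of one long reasoning response


def speedup(fast_tps: float, slow_tps: float) -> float:
    """Ratio of the faster throughput to the slower one."""
    return fast_tps / slow_tps


def generation_seconds(tokens: int, tps: float) -> float:
    """Seconds needed to emit `tokens` at a steady `tps` rate."""
    return tokens / tps


ratio = speedup(TOGETHER_TPS, AZURE_TPS)
fast_seconds = generation_seconds(RESPONSE_TOKENS, TOGETHER_TPS)
slow_seconds = generation_seconds(RESPONSE_TOKENS, AZURE_TPS)

print(f"speedup: {ratio:.1f}x")
print(f"at {TOGETHER_TPS:.0f} tok/s: {fast_seconds / 60:.1f} min")
print(f"at {AZURE_TPS:.0f} tok/s: {slow_seconds / 60:.1f} min")
```

Under that assumed response length, the faster rate finishes in about two minutes, which lines up with the 2-3 minute request durations mentioned above, while the slower rate takes well over twenty minutes.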
The company is also shipping new tools for the reasoning wave: it launched 'reasoning clusters' that provision dedicated capacity (128 to 2,000 chips) to run models at optimal performance, and released DeepSeek V4 Pro with a 512K token context window on serverless and dedicated endpoints. Together AI also raised $800 million previously to accelerate the open-source shift, underpinning its market position against hyperscaler AI platforms.
For infrastructure architects: the Together AI thesis is that open-source reasoning models, rather than closed models locked to a single cloud, will dominate production workloads. If correct, this shift favors specialized inference providers over generalist cloud platforms. The roughly 12x performance gap between Together and Azure on the same model suggests that inference optimization has become a competitive moat, and that enterprises are choosing performance over convenience, a reversal of decades of cloud consolidation logic.
Sources
- Primary source
- venturebeat.com
“Today the company announced a $305 million series B round of funding, led by General Catalyst and co-led by Prosperity7.”
- venturebeat.com
“It's a fairly expensive model to run inference on. It has 671 billion parameters and you need to distribute it over multiple servers. And because the quality is higher, there's generally more demand on the top end, which means you need more capacity.”
- venturebeat.com
“For instance, we serve the DeepSeek-R1 model at 85 tokens per second and Azure serves it at 7 tokens per second. There is a fairly widening gap in the performance and cost that we can provide to our customers.”