TOGETHER AI

Introduction

Artificial intelligence (AI), especially in its generative forms, has transcended academic curiosity to become a cornerstone of contemporary technology ecosystems. The prevailing AI paradigm emphasises not only algorithmic performance but also the infrastructure that enables scalable, efficient and secure execution of large models. This infrastructure encompasses computational hardware, software stacks tailored for distributed processing, developer interfaces, governance frameworks and support for text, image and audio modalities.

In this context, Together AI stands as a paradigmatic example of a modern AI infrastructure provider. Established in 2022 with a mission to support the entire AI lifecycle via cloud-based services, Together AI quickly carved a niche at the intersection of open-source advocacy and high-performance cloud infrastructure. It offers a platform that enables developers, researchers and enterprises to train, fine-tune, deploy and run generative AI models with enhanced performance, cost-effectiveness and flexibility compared to traditional cloud or proprietary alternatives.

This paper explores how Together AI has articulated and operationalised its infrastructure vision, the historical forces shaping its emergence, its key technological innovations and the broader implications of its work. By doing so, it contributes to a deeper understanding of how AI infrastructure is evolving in the early 21st century and the role of emerging companies in shaping that evolution.

Founding and Early Vision

Together AI was founded in 2022 in San Francisco, California, by a team of engineers and researchers with academic backgrounds at institutions such as Stanford University and ETH Zurich. The founders, including Vipul Ved Prakash and Ce Zhang, aimed to build an infrastructure platform optimised for modern generative AI workloads, particularly those rooted in open-source models, which were growing in both capability and community support.

From its inception, the company’s orientation was distinct from many traditional AI infrastructure providers. Rather than anchoring services exclusively around proprietary models or closed services, Together AI committed to an open-source ethos, supporting extensive libraries of open models and enabling developers to retain control over architectures and data. This emphasis positioned Together AI as a platform that sought to lower barriers to entry for AI innovation, widening access beyond large hyper-scale cloud users to individual developers, startups and enterprises.

Financing and Strategic Growth

The company’s early growth trajectory was underpinned by substantial venture financing, beginning with a Series A funding round of approximately USD 102.5 million in 2023, led by Kleiner Perkins, followed by additional rounds led by Salesforce Ventures and other prominent investors. This early funding laid the foundation for infrastructure build-outs and platform development.

In February 2025, Together AI announced a Series B financing round of USD 305 million, led by General Catalyst with co-investors such as Prosperity7 Ventures and NVIDIA, valuing the enterprise at an estimated USD 3.3 billion. The round was explicitly intended to fund the scaling of cloud infrastructure, including large-scale deployment of NVIDIA's Blackwell GPUs, and it solidified the company's positioning as a key player in the AI infrastructure domain.

The scale of this investment reflected investor confidence in Together AI’s proposition: a vertically integrated platform spanning training, inference and deployment while anchored in open-source model support. It also underscored a broader industry recognition of infrastructure as a strategic frontier in the AI economy, one that extends beyond proprietary services to shared computing ecosystems.

Core Cloud Infrastructure

At the core of Together AI’s infrastructure offering are its cloud-based GPU clusters, designed specifically for AI workloads. These clusters are built on high-performance hardware such as NVIDIA Blackwell GPUs (e.g., GB200 NVL72 and HGX B200 systems), which are optimised for both training and inference of large models.

Together AI’s infrastructure is engineered for high throughput and low latency, with custom software stacks, high-speed interconnects (such as InfiniBand and NVIDIA NVLink) and advanced cooling systems, features that are essential for efficient large-scale training and real-time inference. According to company documentation, these systems can scale from thousands to over 100,000 GPUs across distributed global data centres, enabling work on models at industrial scale.

This hardware foundation is complemented by infrastructure management tools that utilise technologies such as Kubernetes and Slurm, enabling dynamic workload orchestration and efficient resource utilisation. Such orchestration layers are critical for optimising performance and enabling elastic scaling as workload demands vary.
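To make the orchestration layer concrete, the following Python sketch requests an eight-GPU pod through the Kubernetes API using the official kubernetes client library. It is an illustrative example under stated assumptions, not Together AI's internal tooling: the pod name, namespace and container image are hypothetical placeholders, and the cluster is assumed to run the NVIDIA device plugin so that GPUs appear as schedulable resources.

    from kubernetes import client, config

    # Load credentials from the local kubeconfig (hypothetical cluster).
    config.load_kube_config()

    # Describe a single training pod that requests eight GPUs; the image
    # and command are illustrative placeholders.
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="llm-train-worker"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="trainer",
                    image="registry.example.com/llm-trainer:latest",
                    command=["python", "train.py"],
                    resources=client.V1ResourceRequirements(
                        # GPUs are exposed to the scheduler by the NVIDIA
                        # device plugin as the nvidia.com/gpu resource.
                        limits={"nvidia.com/gpu": "8"}
                    ),
                )
            ],
        ),
    )

    # Submit the pod; the scheduler places it on a node with free GPUs.
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)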

Software Optimisation and Performance Engineering

Beyond raw compute resources, Together AI’s platform incorporates significant software optimisation aimed at accelerating model training and inference while improving cost efficiency. Central to this software stack is the Together Kernel Collection, a suite of customised CUDA kernels developed to enhance performance for common AI operations across supported models. These innovations reportedly enable reduced training times and improved throughput compared to baseline frameworks.
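The Together Kernel Collection itself is not public, but the underlying principle, collapsing a chain of GPU operations into fewer fused kernels, can be sketched with PyTorch's torch.compile. The example below is a generic stand-in under that assumption, not TKC code, and requires a CUDA-capable GPU.

    import torch

    # A transformer-style feed-forward block: matmul -> GELU -> matmul.
    # Run eagerly, each operation launches separate GPU kernels; a fusing
    # compiler (or hand-written custom kernels) merges the pointwise work
    # into fewer launches, cutting memory traffic and launch overhead.
    def feed_forward(x, w1, w2):
        return torch.nn.functional.gelu(x @ w1) @ w2

    compiled_ff = torch.compile(feed_forward)  # emits fused GPU kernels

    x = torch.randn(4096, 4096, device="cuda", dtype=torch.bfloat16)
    w1 = torch.randn(4096, 16384, device="cuda", dtype=torch.bfloat16)
    w2 = torch.randn(16384, 4096, device="cuda", dtype=torch.bfloat16)

    out = compiled_ff(x, w1, w2)  # first call compiles; later calls reuse it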

The platform also draws on FlashAttention-3 kernels and related research advances that improve memory efficiency and computational throughput. Keeping GPU resources highly utilised is a non-trivial challenge for large language model workloads, which involve massive parameter matrices and extensive tensor computations.
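The effect of such kernels can be illustrated without access to proprietary code: PyTorch's scaled_dot_product_attention dispatches to fused, FlashAttention-style kernels on supported GPUs, avoiding materialisation of the full attention score matrix. The sketch below is a generic demonstration with arbitrary tensor shapes, assuming a CUDA GPU.

    import torch
    import torch.nn.functional as F

    # Query, key and value tensors shaped (batch, heads, seq, head_dim).
    batch, heads, seq, head_dim = 2, 16, 4096, 128
    q = torch.randn(batch, heads, seq, head_dim,
                    device="cuda", dtype=torch.bfloat16)
    k = torch.randn_like(q)
    v = torch.randn_like(q)

    # Dispatched to a fused attention kernel where available, so the
    # (seq x seq) attention matrix is never written to GPU memory.
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)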

Together AI’s infrastructure is designed not only to deliver high performance but also to support multiple stages of the AI lifecycle (training, fine-tuning and inference) through unified APIs and integrated workflows. This approach is central to the company’s AI Acceleration Cloud vision: a platform where organisations can move seamlessly from prototyping to production deployment without substantial re-engineering.
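As a concrete illustration of that unified surface, the sketch below issues a serverless inference request through the together Python SDK. It is a minimal example under stated assumptions: the model identifier is illustrative and may not match current platform listings, and a TOGETHER_API_KEY environment variable is presumed to be set.

    from together import Together  # pip install together

    # The client reads TOGETHER_API_KEY from the environment.
    client = Together()

    # The model name below is an illustrative placeholder; the platform's
    # model catalogue lists the identifiers actually available.
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=[{"role": "user",
                   "content": "Summarise FlashAttention in one sentence."}],
    )
    print(response.choices[0].message.content)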

Enterprise Platform and Secure Deployment

Recognising the varied deployment preferences of enterprise users, Together AI developed the Together Enterprise Platform, which allows customers to run generative AI securely in any environment with enhanced control over data and costs. This platform includes advanced optimisation features such as continuous model refinement, automated resource tuning and compatibility with on-premises infrastructure, offering up to 2–3× faster inference with significant cost savings compared to traditional cloud infrastructure.

Crucially, this enterprise platform enables confidential deployment options (such as on private clouds or within virtual private cloud environments) that keep proprietary data within corporate firewalls, an essential attribute for regulated industries with strict data sovereignty requirements.

RedPajama and Open-source Commitments

In April 2023, Together AI launched RedPajama, an open-source initiative to build a large-scale training corpus reproducing the data recipe described in Meta’s LLaMA paper, whose training dataset was never publicly released. The dataset, comprising over 1.2 trillion tokens, was developed to support state-of-the-art model training while aligning with the company’s open access philosophy.
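For readers who wish to inspect the corpus, the sketch below streams a few records with the Hugging Face datasets library. It assumes the dataset identifier as published on the Hugging Face Hub and uses the smaller sample subset to avoid a multi-terabyte download; exact loading options may vary between library versions.

    from datasets import load_dataset  # pip install datasets

    # Stream the published sample subset rather than downloading it.
    ds = load_dataset(
        "togethercomputer/RedPajama-Data-1T-Sample",
        split="train",
        streaming=True,
    )

    # Peek at the first few raw text records.
    for record in ds.take(3):
        print(record["text"][:200])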

By contributing to the open-source ecosystem, Together AI sought to reduce proprietary dependencies and encourage community participation in training and deploying competitive models. RedPajama served not only as a practical resource but also as a philosophical commitment to democratising AI development, addressing barriers associated with proprietary data and closed infrastructure.

Support for Open Models and Multi-modal Workloads

Together AI’s platform supports a wide range of open-source models across multiple modalities, including text, image, audio and code. This contrasts with many traditional AI services that primarily promote proprietary or closed-source models as part of integrated commercial offerings. Through its API compatibility and robust infrastructure, Together AI enables users to deploy models such as Meta’s Llama, DeepSeek’s open models and others, thereby broadening the repertoire of accessible generative AI tools.
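Because the inference endpoint follows the OpenAI API convention, existing client code can often be repointed simply by overriding the base URL. The sketch below shows this pattern with the openai Python package; the API key is a placeholder and the model identifier is illustrative.

    from openai import OpenAI  # pip install openai

    # Point a standard OpenAI-style client at the compatible endpoint.
    client = OpenAI(
        base_url="https://api.together.xyz/v1",
        api_key="YOUR_TOGETHER_API_KEY",  # placeholder credential
    )

    reply = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1",  # illustrative model identifier
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(reply.choices[0].message.content)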

This multi-model support aligns with a larger trend in AI towards interoperability and choice, allowing developers and organisations to select tools that best fit their applications while avoiding vendor lock-in.

European Expansion and Regional Infrastructure

In 2025, Together AI announced and initiated a strategic infrastructure expansion into Europe via partnerships with companies such as Hypertec and 5C Group. This endeavour aimed to deploy up to 2 gigawatts of dedicated data centre capacity and approximately 100,000 NVIDIA GPUs across multiple European countries, including France, the United Kingdom, Italy and Portugal, by 2028. This expansion reflected both market demand and regulatory considerations, particularly the need for sovereign and regulation-ready AI infrastructure under European data protection regimes.

The European build-out demonstrates how Together AI’s infrastructure strategy extends beyond North America, embedding compliance with regional regulation and serving research institutions, enterprises and developers who require data sovereignty and high-performance computing close to end users.

The company has also activated infrastructure in locations such as Sweden, enabling the delivery of its serverless inference APIs and supporting a growing community of global users. These deployments illustrate how Together AI’s physical infrastructure strategy, complemented by its cloud platform, aims to reduce latency, improve regional performance and comply with local data governance standards.

Adoption, Customers and Industry Position

Together AI’s platform is used by over 450,000 AI developers and a range of enterprises that span technology, media, telecommunications and services sectors. Key customers and partners include Salesforce, Zoom, SK Telecom and Zomato, testifying to the company’s appeal across diverse industries seeking to build generative AI applications.

This broad adoption reflects the value proposition of Together AI’s infrastructure: a unified AI lifecycle platform that offers efficient GPU use, open-source model support and enterprise deployment flexibility. Organisations can benefit from advanced automation, continuous model optimisation and scalable deployment options without investing in proprietary infrastructure.

Together AI operates in a competitive space characterised by the presence of hyper-scale cloud providers, third-party GPU cloud services and hybrid infrastructure platforms. Its positioning emphasises performance, openness and ecosystem integration as differentiators from large public cloud offerings. While hyperscalers provide broad multi-purpose services, Together AI’s specialised infrastructure claims 2–3× faster inference and significant cost savings for certain workloads, a key competitive advantage in high-demand AI use cases.

Moreover, by aligning with the open-source movement, Together AI attracts users who prioritise transparency, model control and interoperability, cases where proprietary offerings may present constraints. The company’s support for widely used models across modalities serves to reinforce this position.

Challenges and Strategic Pressures

Despite its growth, Together AI faces fundamental challenges linked to the high capital intensity of AI infrastructure. Building and operating large GPU clusters entails significant energy consumption, capital expenditure and logistical complexity. Ensuring efficient utilisation of deployed GPUs and maintaining reliable uptime in distributed data centres are critical operational imperatives. Moreover, balancing infrastructure investment with sustainable cost structures remains a key concern as the company scales.

As Together AI expands into new markets, it must navigate varying regulatory environments, especially in Europe, where data protection and AI governance frameworks (e.g., the AI Act) impose stringent requirements on data processing and system transparency. The company’s emphasis on sovereign infrastructure deployment and compliance-ready platforms is a strategic response to these challenges, but ongoing legal and policy shifts may require adaptive infrastructure and governance strategies.

Emerging trends in AI, including specialised accelerators, decentralised compute models and edge AI architectures, may reshape the landscape of AI infrastructure. Together AI’s continued competitiveness will depend on its ability to innovate in software optimisation, interoperability with emerging compute hardware and integration of new AI paradigms.

Conclusion

Together AI represents a salient example of how contemporary AI infrastructure provision is evolving at the intersection of open-source ideals, high-performance hardware and cloud-based scalability. From its founding in 2022 to its multi-billion-dollar valuation and global expansion, the company has articulated a compelling proposition: democratise access to generative AI by offering a platform that supports the full AI lifecycle (training, fine-tuning, inference and deployment) on performance-optimised infrastructure.

The company’s historical arc underscores the growing importance of specialised AI infrastructure in the broader technology ecosystem. Through strategic investments in hardware, vertically integrated software and an open-source ecosystem, Together AI contributes to reshaping how developers and enterprises build, scale and govern AI systems. At the same time, it highlights the enduring challenge of balancing infrastructure scale with operational sustainability and regulatory compliance.

Future research should continue to monitor how companies like Together AI influence the dynamics of AI infrastructure markets, the trade-offs between openness and control and the implications for innovation across sectors embracing generative AI.
