
Introduction
Unlocking the full potential of artificial intelligence requires a robust, high-performance infrastructure. Imagine a cloud environment that can feed thousands of GPUs with massive data streams at unprecedented speeds. This is no longer a dream; it’s a reality. In collaboration with Google Cloud and IBM, Sycomp has delivered a production-ready, optimized AI platform capable of achieving 1.2 terabytes per second (1.2 TB/s) of IO throughput, revolutionizing AI workloads in the cloud.
Context/Background
The demand for advanced computing platforms that combine container efficiency, GPU processing power, and high-speed storage is skyrocketing. Traditional cloud solutions have struggled to deliver the necessary throughput to keep pace with intensive AI and machine learning (ML) workloads. In July 2024, a leading AI vendor challenged Google, Sycomp, and IBM to create a cloud-optimized platform capable of sustaining 1.2 TB/s of IO throughput for production workloads running on thousands of GPUs. This challenge spurred the development of a next-generation AI platform built on Google Cloud.
Use Cases and Challenges
Organizations across various sectors are leveraging AI to accelerate generative processes, enhance customer experiences through deep learning, and improve robotic vision. However, these applications demand immense computational resources and data bandwidth. The primary challenge has been finding an end-to-end platform that can support this level of advanced computing while ensuring reliability and performance. This includes addressing hardware lead times, data center limitations, and the complexities of high-speed networking.
Considerations and Tradeoffs
Other vendors may have achieved similar throughput levels in benchmarks, but true production readiness involves more than just raw speed. Sycomp Intelligent Data Storage Platform, built on Google persistent storage, is designed to withstand power outages and cloud vendor rack issues without data loss or performance degradation. The key consideration was to create a solution that is not only fast but also resilient and scalable, capable of meeting the demanding needs of real-world AI deployments.
Solution Details (Including Architectural Information)
The solution comprises a 128-server storage cluster with 50 NFS servers, supporting a Google Kubernetes Engine (GKE) cluster of 1280 high-performance A3 systems with H100 GPUs. The platform utilizes Google's latest generation of high-performance persistent disk and scalable networking. It was built on Sycomp Intelligent Data Storage Platform, which includes a software-defined implementation with IBM Storage Scale and Real Insight Storage Engine (RISE). This platform offers a high-performance global namespace, transparent data mobility, and concurrent data access via NFS and the IBM Storage Scale client. The GPFS object interface automatically copies checkpoints and data sets into a GCS bucket for durability. Phase one of the project consistently delivered 800 GiB/s of sequential throughput using NFS. Phase two delivers in excess of 1.2 TB/s of throughput with Storage Scale native clients in GKE; previews are available today, with general availability to follow later this summer.
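To put these figures in perspective, the short Python sketch below works out the per-server and per-node bandwidth implied by the numbers quoted above. It is a back-of-envelope illustration, not a statement of how the cluster was actually sized. Two assumptions are ours: that each A3 system carries 8 H100 GPUs (the common A3 machine shape), and that load is spread evenly across servers and nodes. The function and constant names are illustrative only.

```python
# Back-of-envelope sizing from the figures quoted in this article.
# Units: 1 TB = 1000 GB (decimal), 1 GiB = 2**30 bytes (binary).

STORAGE_SERVERS = 128    # storage servers in the cluster (from the article)
NFS_SERVERS = 50         # NFS servers (from the article)
GKE_NODES = 1280         # A3 systems in the GKE cluster (from the article)
GPUS_PER_NODE = 8        # ASSUMPTION: 8 H100 GPUs per A3 system
PHASE_ONE_GIB_S = 800    # phase one sequential throughput over NFS
PHASE_TWO_GB_S = 1200    # phase two target, 1.2 TB/s expressed in GB/s


def gib_to_gb(gib):
    """Convert binary gibibytes to decimal gigabytes."""
    return gib * 2**30 / 1e9


def sizing():
    """Return derived figures, assuming load is spread evenly."""
    return {
        "total_gpus": GKE_NODES * GPUS_PER_NODE,
        "phase_one_gb_s": gib_to_gb(PHASE_ONE_GIB_S),
        "phase_one_gib_s_per_nfs_server": PHASE_ONE_GIB_S / NFS_SERVERS,
        "phase_two_gb_s_per_storage_server": PHASE_TWO_GB_S / STORAGE_SERVERS,
        "phase_two_gb_s_per_gke_node": PHASE_TWO_GB_S / GKE_NODES,
    }


if __name__ == "__main__":
    for name, value in sizing().items():
        print(f"{name}: {value:,.3f}")
```

Under these assumptions, each storage server has to sustain a little over 9 GB/s in phase two, while each GKE node consumes just under 1 GB/s on average. Note also that phase one is quoted in GiB/s and phase two in TB/s, so the two figures are closer together than they first appear once converted to the same unit.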
Outcomes
The delivered platform accelerates AI solution delivery by eliminating hardware lead times and data center setup challenges. Sycomp Intelligent Data Storage Platform provides a high-performance global namespace, enabling AI pipelines to keep GPUs fed with the data they need for training and inference. The result is faster model completion and shorter inference times. Specific benefits include improved GKE performance, enhanced GPU utilization (97%, with single-node throughput greater than 29 GB/s), and the ability to deploy the solution in minutes. The platform delivers IBM Storage Scale as a managed service, offering flexibility in storage type, bandwidth, and data access methods.
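For readers who want a feel for what a single-node throughput figure means, here is a minimal sketch of a sequential-read timing loop in Python. It is not the benchmark used for the results above, which the article does not describe. The function name, the 8 MiB block size, and the single-stream design are all our assumptions. A single Python reader will not come close to saturating a parallel file system, and the operating system's page cache can inflate results on repeated runs.

```python
import time


def sequential_read_throughput(path, block_size=8 * 1024 * 1024):
    """Read a file front to back and return (bytes_read, GB_per_second).

    Illustrative only: one stream, one process, default OS caching.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer layer; the OS may still cache.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # an empty read means end of file
                break
            total += len(chunk)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed / 1e9
```

Pointing this at a large file on a mounted file system gives a rough single-stream number. Approaching figures such as 29 GB/s per node would typically require many parallel streams and a purpose-built benchmarking tool.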
Better Together
The collaboration between Google, Sycomp, and IBM has yielded a revolutionary solution that redefines the possibilities of AI in the cloud. By integrating Google Cloud's robust infrastructure, Sycomp Intelligent Data Storage Platform, and IBM’s parallel file system technology, customers can now create a seamless, high-performance environment for AI workloads.
Want to accelerate your AI solutions and experience the power of 1.2 TB/s throughput today? Contact us to learn how Google Cloud and Sycomp can transform your AI infrastructure and visit the Google Cloud and Sycomp websites for more information.
