Rafay Debuts New Platform Capabilities to Speed Up GPU Infrastructure Consumption and Monetization

November 19, 2024 at 09:00 AM EST

New Capabilities Enable Self-service Consumption of Accelerated Computing Infrastructure in Addition to AI and ML Tooling for Cloud Providers and Enterprises

Rafay Systems, the leading provider of Platform-as-a-Service (PaaS) capabilities for cloud-native, GPU and AI consumption, today announced new platform advancements that help enterprises and GPU cloud providers deliver developer-friendly consumption workflows for GPU infrastructure. The new Rafay Platform capabilities include enterprise-grade controls, SKU definition, customer-specific policy enforcement and granular chargeback data. Enterprises investing in GPU-based infrastructure in data centers can leverage the Rafay Platform to roll out feature-rich, enterprise-wide GPU clouds that developers and data scientists can consume on demand, complete with workbenches for model training, fine-tuning and inferencing. GPU cloud providers deploying GPUs for consumption by downstream customers can leverage the Rafay Platform to operate a full-featured, multi-tenant GPU PaaS that delivers accelerated computing resources along with AI and ML tooling for training, tuning and serving large language models (LLMs).

GPU Investments Outpace Platform Team Bandwidth, Delaying AI Projects and Increasing Costs

Demand for accelerated computing infrastructure is at an all-time high. A majority of enterprises and service providers are investing in GPU hardware to meet generative AI application development demand. Whether they are buying hardware and deploying it in a data center, or committing to long-term leases with GPU cloud providers, there is urgency to put this expensive hardware in the hands of developers and data scientists. Unfortunately, building a platform to enable self-service consumption of accelerated computing hardware and AI and ML workbenches can be a one- to two-year project. As a result of these platform development delays, expensive hardware is underutilized: nearly a third of enterprises are utilizing less than 15% of GPU capacity.

“Our work with customers across high-stakes industries over the last two quarters has revealed that enterprises and GPU cloud providers are running into similar challenges. Both are looking for ways to speed up the delivery of accelerated computing hardware to developers and data scientists,” said Haseeb Budhani, CEO and co-founder of Rafay Systems. “The new Rafay Platform capabilities address this need, helping enterprises and GPU cloud providers speed the delivery of a PaaS experience in order to monetize their significant investments in accelerated computing infrastructure.”

Rafay Accelerates GPU Monetization With Standardized Platform Building Blocks

With Rafay, GPU cloud providers and enterprises can quickly launch production-ready AI services. Platform teams can now deliver much-needed services to developers and data scientists through a PaaS offering that enables self-service consumption of compute as well as AI and ML workbenches for fast experimentation and productization of AI-based applications.

Newly added Rafay Platform capabilities include:

Multi-tenancy enforcement: Rafay implements robust multi-tenancy controls that allow GPU cloud providers and enterprises to safely and securely deploy workloads from multiple customers on the same infrastructure without the risk of lateral escalation attacks. The Rafay Platform offers new controls to protect against lateral escalation, including a Kubernetes admission controller that automatically wraps pods into isolated Kata containers, each of which operates inside a microVM within a virtual Kubernetes cluster. The platform also supports dynamic network policy definition, zero-trust access management and role-based access control (RBAC). Collectively, these controls ensure demonstrable isolation between tenants, allowing for better monetization of expensive infrastructure.
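
For readers unfamiliar with the mechanism, the admission-controller step can be illustrated with a minimal sketch of a generic Kubernetes mutating admission webhook. This is not Rafay's implementation. It assumes a RuntimeClass named "kata" is installed in the cluster, and the function name build_admission_response is hypothetical. The webhook receives an AdmissionReview for a new pod and returns a JSON Patch that sets the pod's runtimeClassName, so the pod is scheduled onto the Kata runtime.

```python
import base64
import json

# Assumption: a RuntimeClass named "kata" exists in the cluster; the actual
# name depends on how Kata Containers was installed.
KATA_RUNTIME_CLASS = "kata"


def build_admission_response(review: dict) -> dict:
    """Build an AdmissionReview response that sets the pod's runtimeClassName."""
    request = review["request"]
    pod_spec = request["object"].get("spec", {})

    # The response must echo the request uid and state whether the pod is allowed.
    response = {"uid": request["uid"], "allowed": True}

    # Only patch pods that do not already use the Kata runtime class.
    if pod_spec.get("runtimeClassName") != KATA_RUNTIME_CLASS:
        patch = [
            {
                "op": "add",  # JSON Patch "add" also replaces an existing value
                "path": "/spec/runtimeClassName",
                "value": KATA_RUNTIME_CLASS,
            }
        ]
        response["patchType"] = "JSONPatch"
        # Kubernetes expects the patch as a base64-encoded JSON document.
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

In a real deployment this function would sit behind an HTTPS endpoint registered through a MutatingWebhookConfiguration. The sketch covers only the patch logic.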

Programmatic SKUs: For both GPU cloud and enterprise platform teams, Rafay allows programmatic definition of compute and service profiles that can be offered to developers and data scientists as a turnkey package, empowering them to focus on building generative AI apps instead of worrying about the infrastructure. By enabling the dynamic definition of self-service packages — programmatic SKUs — GPU cloud and enterprise customers can better manage infrastructure consumption and ensure high utilization based on customer needs.

With Rafay, customers can programmatically package compute resources and AI applications to deliver Small, Medium or Large offerings that end users can select based on their needs and an associated price. For example, Small may be defined as a Jupyter Notebook environment preconfigured with PyTorch, tied to one NVIDIA H100 GPU and priced at $3 per hour. Medium may be defined as a fine-tuning workbench preconfigured with the Llama 3.1 model, tied to eight NVIDIA H100 GPUs and priced at $20 per hour. This approach replaces hardcoded SKU definition strategies with a solution that scales, helping GPU cloud providers package their offerings to meet market needs, while giving enterprises control over resource consumption.
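
As a rough illustration of defining SKUs as data rather than hardcoding them, the Small and Medium examples above could be modeled as follows. This is a sketch only and does not reflect the Rafay Platform's actual API or schema. The Sku class, the CATALOG dictionary and the estimate_cost helper are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sku:
    """Hypothetical SKU record: a workbench bundled with GPUs and an hourly price."""
    name: str
    workbench: str
    gpu_model: str
    gpu_count: int
    price_per_hour_usd: float


# Values taken from the Small and Medium examples in the text above.
CATALOG = {
    "small": Sku("Small", "Jupyter Notebook with PyTorch", "NVIDIA H100", 1, 3.0),
    "medium": Sku("Medium", "Fine-tuning workbench with Llama 3.1", "NVIDIA H100", 8, 20.0),
}


def estimate_cost(sku_key: str, hours: float) -> float:
    """Return the cost in USD of running the given SKU for a number of hours."""
    sku = CATALOG[sku_key]
    return round(sku.price_per_hour_usd * hours, 2)
```

Under these assumptions, estimate_cost("medium", 2.5) returns 50.0, and adding a Large offering requires only one new catalog entry.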

Purpose-built AI workbenches: With Rafay’s service profile capabilities, platform teams can provide Rafay’s native fine-tuning and inferencing tools or third-party services, such as NVIDIA NIMs and Run:AI, to create AI workbenches for developers and data scientists. These workbenches come pre-configured with all necessary components to speed the delivery of specialized environments for AI and ML workflows. Platform teams can optionally attach these workbenches to SKUs for self-service consumption.

Chargeback and billing: Rafay provides detailed resource tracking and cost attribution features to help GPU cloud providers and enterprises monitor consumption across their user base. GPU cloud providers can leverage chargeback data to generate billing information for customers. Enterprises can leverage chargeback data to internally manage budgets and cost center attribution.
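
To make the chargeback idea concrete, the sketch below totals usage cost per tenant or per cost center from a list of usage records. It is an illustration under stated assumptions, not Rafay's data model. The UsageRecord fields, the chargeback_by function and the sample tenants and rates are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical usage record; field names are assumptions for illustration."""
    tenant: str
    cost_center: str
    hours: float
    rate_per_hour_usd: float


def chargeback_by(records, key):
    """Sum cost in USD across records, grouped by the given attribute name."""
    totals = defaultdict(float)
    for record in records:
        totals[getattr(record, key)] += record.hours * record.rate_per_hour_usd
    return {group: round(total, 2) for group, total in totals.items()}


# Made-up sample data, not real customer usage.
SAMPLE_RECORDS = [
    UsageRecord("acme", "research", 10, 3.0),
    UsageRecord("acme", "platform", 4, 2.5),
    UsageRecord("globex", "research", 2, 3.0),
]
```

A GPU cloud provider would group by tenant to produce billing totals, while an enterprise would group the same records by cost center for internal budgeting.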

The new platform capabilities are now generally available to customers in the Rafay Platform.

Additional Resources

Start a free trial today and learn how to deliver an enterprise PaaS experience to developers and data scientists
Follow Rafay on X and LinkedIn
Read the Rafay Blog: The AI and Cloud-native Infrastructure Blog

About Rafay Systems

Rafay’s mission is to liberate enterprises from the pains and complexities of consuming modern compute infrastructure, allowing them to channel 100% of their developers’ focus into innovation. Companies such as MoneyGram, Guardant Health and MassMutual entrust Rafay to be the cornerstone of their AI and cloud-native infrastructure strategy. Teams rely on Rafay’s product-led approach to platform engineering to deliver a Platform-as-a-Service (PaaS) experience to developers and data scientists, while ensuring strict security and cost policy enforcement. Gartner has recognized Rafay as a Cool Vendor in Container Management and GigaOm named Rafay a Leader and Outperformer in the GigaOm Radar Report for Managed Kubernetes, acknowledging its commitment to driving innovation. To join the ranks of industry leaders who have unlocked the true potential of cloud-native computing with Rafay Systems, please visit www.rafay.co.