Scaleway invites you to their event

Scaling LLMs at Half the Cost: Unlock the Power of Batch API

About this event

πŸ‡¬πŸ‡§ Webinar in English

Running large-scale LLM workloads comes with two major challenges: infrastructure costs and throughput limitations.

As AI adoption accelerates, teams need more efficient ways to process massive volumes of inference requests without compromising performance, reliability, or budget.

In this webinar, our AI experts will introduce Scaleway’s new Batch API and demonstrate how asynchronous processing can help you scale AI workloads more efficiently while reducing inference costs by up to 50%.

You’ll discover how to transition from synchronous to asynchronous architectures using Object Storage, remove rate limit bottlenecks, and deploy scalable batching workflows for real-world AI use cases.
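The shift from synchronous calls to asynchronous batching typically starts with an input file of queued requests that the platform processes on its own schedule. As a minimal sketch, assuming an OpenAI-compatible batch request format (`custom_id`/`method`/`url`/`body` fields) and a hypothetical model name — Scaleway's exact schema may differ, so check the official docs — such a JSONL input file could be built like this:

```python
import json

def build_batch_file(prompts, path, model="llama-3.1-8b-instruct"):
    """Write one self-contained JSON request per line (JSONL).

    The field names follow the OpenAI-compatible batch format; whether
    Scaleway's Batch API uses this exact schema is an assumption, and
    the model name above is a placeholder.
    """
    with open(path, "w", encoding="utf-8") as f:
        for i, prompt in enumerate(prompts):
            request = {
                "custom_id": f"request-{i}",    # lets you match results to inputs
                "method": "POST",
                "url": "/v1/chat/completions",  # target endpoint for this request
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(request) + "\n")

build_batch_file(
    ["Summarize our Q3 report.", "Translate 'hello' to French."],
    "batch_input.jsonl",
)
```

The resulting file would then be uploaded to Object Storage and referenced when creating the batch job; the upload and job-creation calls themselves are platform-specific, which is exactly what the live demo covers.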

Key topics:

  • Reduce inference costs by up to 50% with Batch API
  • Scale LLM workloads without hitting rate limits
  • Move from synchronous to asynchronous AI architectures
  • Learn best practices for high-volume AI processing
  • Deploy batching workflows for Chat, Embeddings, and Rerank use cases
  • Live demo: build and launch a batching job in minutes

Hosted by

  • Team member
    Mickael Nechachby, Customer Success Manager @ Scaleway
  • External speaker
    Franck Pagny, Product Manager - AI Products @ Scaleway

Scaleway

Europe's empowering cloud provider

Scaleway, a leading alternative European infrastructure- and platform-as-a-service (IaaS and PaaS) provider, caters to the global market with an essential mix of cloud computing resources that are flexible, cost-effective, reliable, secure, and sustainably powered.