Scaleway invites you to their event

Scaling LLMs at Half the Cost: Unlock the Power of Batch API

About this event

πŸ‡¬πŸ‡§ Webinar in English

Running large-scale LLM workloads comes with two major challenges: infrastructure costs and throughput limitations.

As AI adoption accelerates, teams need more efficient ways to process massive volumes of inference requests without compromising performance, reliability, or budget.

In this webinar, our AI experts will introduce Scaleway’s new Batch API and demonstrate how asynchronous processing can help you scale AI workloads more efficiently while reducing inference costs by up to 50%.

You’ll discover how to transition from synchronous to asynchronous architectures using Object Storage, remove rate limit bottlenecks, and deploy scalable batching workflows for real-world AI use cases.
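The shift from synchronous calls to asynchronous batching typically starts with an input file of queued requests that the platform processes on its own schedule. As a minimal sketch, assuming an OpenAI-compatible batch request format (`custom_id`/`method`/`url`/`body` fields) and a hypothetical model name — Scaleway's exact schema may differ, so check the official docs — such a JSONL input file could be built like this:

```python
import json

def build_batch_file(prompts, path, model="llama-3.1-8b-instruct"):
    """Write one self-contained JSON request per line (JSONL).

    The field names follow the OpenAI-compatible batch format; whether
    Scaleway's Batch API uses this exact schema is an assumption, and
    the model name above is a placeholder.
    """
    with open(path, "w", encoding="utf-8") as f:
        for i, prompt in enumerate(prompts):
            request = {
                "custom_id": f"request-{i}",    # lets you match results to inputs
                "method": "POST",
                "url": "/v1/chat/completions",  # target endpoint for this request
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }
            f.write(json.dumps(request) + "\n")

build_batch_file(
    ["Summarize our Q3 report.", "Translate 'hello' to French."],
    "batch_input.jsonl",
)
```

The resulting file would then be uploaded to Object Storage and referenced when creating the batch job; the upload and job-creation calls themselves are platform-specific, which is exactly what the live demo covers.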

Key topics:

  • Reduce inference costs by up to 50% with Batch API
  • Scale LLM workloads without hitting rate limits
  • Move from synchronous to asynchronous AI architectures
  • Learn best practices for high-volume AI processing
  • Deploy batching workflows for Chat, Embeddings, and Rerank use cases
  • Live demo: build and launch a batching job in minutes

Hosted by

  • Team member
    Mickael Nechachby, Customer Success Manager @ Scaleway
  • External speaker
    Franck Pagny, Product Manager - AI Products @ Scaleway

Scaleway

Europe's empowering cloud provider

Scaleway, a leading alternative European infrastructure- and platform-as-a-service (IaaS and PaaS) provider, caters to the global market with an essential mix of cloud computing resources that are flexible, cost-effective, reliable, secure, and sustainably powered.