OLake by Datazip invites you to their event

Apache Iceberg Catalog Deep Dive and Demo: Comparing Glue, Polaris, Lakekeeper, & Beyond in 2025.

About this event

Webinar Overview

As Apache Iceberg continues its rapid evolution and the catalog ecosystem expands, data engineers must make pivotal decisions about metadata management that directly influence query performance, costs, and operational complexity. Join this technical deep dive into the current catalog landscape, complete with live implementations, performance comparisons, and insights into leading solutions—including the newly GA'd Polaris v1.0 and emerging innovators reshaping the field.

Agenda

  • Why Iceberg?
    The problems Iceberg solves, plus a breakdown of its underlying architecture, including the storage, metadata, and catalog layers.
  • Why You Need Catalogs
    The limitations of using raw Iceberg without a catalog, shown alongside a live demo.
  • Demo Time: Catalog Setup Differences and Unique Features
    Setup variations and distinctive features across catalogs: Glue (enterprise integration and Lake Formation), Polaris (REST API and multi-engine support), Lakekeeper (serverless auto-scaling), and Nessie (branching for experimentation).
  • Ecosystem Update Since February 2025
  • Q&A and Conclusion

Who Should Attend

  • Data Engineers implementing or migrating Iceberg table formats in production
  • Platform Engineers evaluating catalog solutions for multi-petabyte data lakes
  • Data Architects designing metadata management strategies for hybrid cloud environments
  • DevOps Engineers managing data infrastructure costs and performance optimization
  • Technical Leaders making vendor selection decisions for lakehouse architectures

If you're passionate about Apache Iceberg and eager to level up your skills in cutting-edge data engineering, this webinar is for you.

Key Takeaways

By the end of this session, you'll gain:

  • A clear understanding of why Iceberg is essential and how catalogs address its core challenges
  • Hands-on insights into setting up and differentiating major catalogs with their standout features
  • Updates on the latest 2025 ecosystem shifts and their implications for your data strategies
  • Practical guidance on migrations, integrations, and production best practices

Technical Prerequisites

This webinar assumes familiarity with:

  • Apache Iceberg table format basics and metadata structure
  • SQL engines (Spark, Trino, or similar) and lakehouse query patterns
  • Cloud storage systems (S3, GCS, ADLS) and object store performance characteristics
  • Basic understanding of distributed systems and eventual consistency models

All demonstrations will include configuration examples and performance metrics to enable immediate implementation in your production environments.

Don't miss this opportunity to dive deep into Apache Iceberg's catalog ecosystem with experts from the community. Register now and join fellow data engineers in advancing your Iceberg expertise. 

Hosted by

  • Team member
    Sandeep Devarapalli, Co-founder and CEO @ Datazip, Inc.

  • Team member
    Akshay Sharma, DevRel @ Datazip

    Developer Advocate at Datazip, helping engineers and contributors adopt open lakehouse technologies. I manage our contributor community and showcase how OLake delivers the fastest data replication framework to teams building at scale.

  • Team member
    Harsha Kalbalia, GTM @ Datazip | Founding Member @ Datazip

    Harsha is a user-first GTM specialist at Datazip, transforming early-stage startups from zero to one. With a knack for technical market strategy and a startup enthusiast's mindset, she bridges the gap between innovative solutions and meaningful market adoption.

  • Guest speaker
    Arsham Eslami, Greybeam

    Arsham Eslami is Co-Founder of Greybeam, where he builds automated workload routing systems to optimize Snowflake costs and performance. His technical expertise spans distributed systems, real-time analytics, and cost optimization for cloud-native data architectures. He is an active contributor to open-source data tooling and infrastructure automation.

OLake by Datazip

Fastest way to replicate your data to Apache Iceberg.

OLake is an open-source data ingestion tool available on GitHub, developed by Datazip, Inc. Its primary function is to replicate data from transactional databases and streaming platforms (like PostgreSQL, MySQL, MongoDB, Oracle, and Kafka) into open data lakehouse formats, like Apache Iceberg.