OLake by Datazip invites you to their event

Best practices for migrating to Apache Iceberg

About this event

[Key highlights]

  • A deep dive into file formats, compression strategies, and write patterns
  • Practical guidance on Merge on Read (MoR) vs Copy on Write (CoW) implementation
  • Essential configurations for maintenance and monitoring 
  • Benchmarks [duration & cost] compared with Amazon EMR Trino, Snowflake, Snowflake Iceberg, Starburst, Athena

[What else you'll learn]

  • How to select tables for migration and assess critical queries
  • Optimal compaction strategies (BinPack, Sort, Z-order)
  • Key configurations for production deployment
  • Monitoring best practices using Iceberg virtual tables

Hosted by

  • Guest speaker
    Yonatan Dolan Principal Analytics Specialist @ AWS

    Yonatan Dolan is a Principal Analytics Specialist at AWS, focusing on Big Data & Analytics in Israel. He is an Apache Iceberg evangelist and actively drives data lake innovations. Before AWS, he led Intel's Pharma Analytics Platform, developing edge-to-cloud AI solutions for clinical trials, and spent 9 years driving advanced analytics projects at Intel.

  • Guest speaker
    Amit Gilad Data Engineer

    Amit Gilad is a Data Engineer who has been actively working with Apache Iceberg and data lakes. He currently leads data engineering at a company in stealth and previously worked as a data engineer at Cloudinary. He has hands-on experience with EMR, Athena, and Spark, and recently shared insights about Iceberg implementations without Spark at the Chill Data Summit.

  • Team member
    Harsha Kalbalia GTM @ Datazip | Founding Member @ Datazip

    Harsha is a user-first GTM specialist at Datazip who takes early-stage startups from zero to one. With a knack for technical market strategy and a startup enthusiast's mindset, she bridges the gap between innovative solutions and meaningful market adoption.

OLake by Datazip

Fastest way to replicate your data to Apache Iceberg.

OLake is an open-source data ingestion tool available on GitHub, developed by Datazip, Inc. Its primary function is to replicate data from transactional databases and streaming platforms (such as PostgreSQL, MySQL, MongoDB, Oracle, and Kafka) into open data lakehouse formats such as Apache Iceberg.