Iceberg for Agents: Turning Lakehouse Data into AI Context

Thursday, January 8, 2026 at 8:30 pm (IST)

About 1 hour

About this event

AI agents often fail in production because they're overwhelmed with data but starved for context. The models aren't the problem. The bottleneck is the data stack: fragmented silos, inconsistent definitions, and business logic that lives only in tribal knowledge. Agents need structured, reliable, and interpretable context, not just data access.

In this session, we'll show how Apache Iceberg becomes the backbone of AI-ready pipelines. You'll learn how to elevate your Iceberg implementation from a storage format to a live context layer that powers structured retrieval-augmented generation (RAG), schema-aware agents, and autonomous reasoning grounded in truth.

What we'll cover:

Iceberg Foundations for AI - From ACID to Time Travel
From Rows to Relationships - The Role of the Semantic Layer
Structured RAG in Practice - Fully Open Source

The session includes a live demo of a fully open-source Structured RAG stack built on Apache Iceberg, featuring semantic query translation, hybrid retrieval, and governed agent reasoning. Expect architecture diagrams, real code, and practical guidance.

Hosted by

External speaker

Andrew Madson

Team member

Harsha Kalbalia, GTM and Founding Member @ Datazip

Harsha is a user-first GTM specialist at Datazip, transforming early-stage startups from zero to one. With a knack for technical market strategy and a startup enthusiast's mindset, she bridges the gap between innovative solutions and meaningful market adoption.

OLake by Datazip

Fastest way to replicate your data to Apache Iceberg.

OLake is an open-source data ingestion tool available on GitHub, developed by Datazip, Inc. Its primary function is to replicate data from transactional databases and streaming platforms (like PostgreSQL, MySQL, MongoDB, Oracle, and Kafka) into open data lakehouse formats, like Apache Iceberg.
