28-29, August 2025
Amsterdam, Netherlands
Note: The schedule, including session times and room locations, is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for AI_dev Europe 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Central European Summer Time, CEST (UTC +2). To see the schedule in your preferred timezone, please select from the drop-down menu to the right.


Venue: G001-002
Thursday, August 28
 

11:15 CEST

Securing AI Pipelines: Real-World Attacks on Kubernetes-Based AI Infrastructure - Abhinav Sharma, KodeKloud
Thursday August 28, 2025 11:15 - 11:40 CEST
When an ML engineer deploys a Stable Diffusion model to Kubernetes, they unwittingly create an attack surface unlike anything traditional security teams have encountered. I discovered this firsthand after our "perfectly secured" AI cluster was compromised.
Speakers

Abhinav Sharma

Site Reliability Engineer, KodeKloud
I am a Site Reliability Engineer at KodeKloud and an open-source contributor who has evaluated and contributed to various open-source tools and projects, such as Microsoft's open-source libraries, OpenCV, and SUSE. I was also a Google Summer of Code 2022 contributor and a GitHub Extern…
G001-002

11:50 CEST

Unlocking Scalable Distributed Training With Arrow Data Cache on Kubernetes - Ricardo Aravena, Snowflake & Andrey Velichkevich, Apple
Thursday August 28, 2025 11:50 - 12:15 CEST
As the scale of AI models and training datasets grows, so does the complexity of efficiently feeding data into GPU-accelerated training workloads. Traditional I/O stacks are becoming a bottleneck—especially in cloud native environments—where elasticity and performance must go hand in hand. This talk introduces an open-source, Arrow-based data cache for distributed training workloads on Kubernetes and tabular datasets stored as Apache Iceberg tables.
Speakers

Ricardo Aravena

Software Engineer, Snowflake
Ricardo is an AI Infrastructure Lead at Snowflake. He's passionate about open source and serves in various roles, such as co-chairing the CNCF TAG-Runtime and leading the Cloud Native AI Working Group. With over 25 years of experience in the tech industry…

Andrey Velichkevich

Software Engineer, Apple
Andrey Velichkevich is a Senior Software Engineer at Apple and a key contributor to the Kubeflow open-source project. He is a member of the Kubeflow Steering Committee and a co-chair of the Kubeflow AutoML and Training WG. Additionally, Andrey is an active member of the CNCF WG AI. He…
G001-002

13:35 CEST

From Hours To Milliseconds: Scaling AI Inference 10x With Serverless on Kubernetes - Anmol Krishan Sachdeva & Paras Mamgain, Google
Thursday August 28, 2025 13:35 - 14:00 CEST
Imagine deploying a complex AI model for real-time inference, but facing latency that slows down your application. We've all been there. This talk isn't just about theoretical serverless benefits; it's about real-world performance gains. We'll show you how we slashed inference latency from several seconds to under 100 milliseconds, achieving a 10x improvement in throughput, by harnessing the power of serverless on Kubernetes.
Speakers

Paras Mamgain

Software Engineer, Google
Paras has been an active speaker, sharing his technical expertise at Google tech conferences and the Linux Foundation Open Source Summit in Japan and North America. Paras is a highly skilled backend developer with a passion for information retrieval and a knack for translating complex technical…

Anmol Krishan Sachdeva

Sr. Hybrid Cloud Architect, Google
Anmol is a seasoned International Tech Speaker (delivered 75+ talks), a Distinguished Guest Lecturer, an active conference organizer, and has published several notable papers. He works at Google and focuses on Emerging Technologies.
G001-002

14:10 CEST

Fast Inference, Furious Scaling: Leveraging vLLM With KServe - Rafael Vasquez, IBM
Thursday August 28, 2025 14:10 - 14:35 CEST
In this talk, we will introduce two open-source projects, vLLM and KServe, and explain how they can be integrated to improve performance and scalability for LLMs in production. The session will include a demo showcasing their integration.
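As a rough illustration of what such an integration can look like (this sketch is not taken from the session itself), KServe can expose an LLM through an InferenceService manifest whose serving runtime uses vLLM as its backend where supported; the name, model id, and resource values below are placeholders:

```yaml
# Illustrative KServe InferenceService (names and values are placeholders)
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-demo
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface          # runtime that can serve models via vLLM
      args:
        - --model_name=llm-demo
        - --model_id=facebook/opt-125m   # illustrative model id
      resources:
        limits:
          nvidia.com/gpu: "1"      # one GPU for the predictor pod
```

Applied with `kubectl apply -f`, KServe then handles request routing and autoscaling for the model endpoint.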
Speakers

Rafael Vasquez

Open Source Software Developer, IBM
Rafael Vasquez is a software developer on the Open Technology team at IBM. He previously completed an MASc working on self-driving-car research and transitioned from a data scientist role in the retail field to his current role, where he continues to grow his passion for MLOps and…
G001-002

14:45 CEST

RamaLama: Making Working With AI Models Cloud Native and Boring - Eric Curtin, Red Hat
Thursday August 28, 2025 14:45 - 15:10 CEST
Managing and deploying AI models can often require extensive system configuration and complex software dependencies. RamaLama, a new open-source tool, aims to make working with AI models straightforward by leveraging container technology, making the process "boring"—predictable, reliable, and easy to manage. RamaLama integrates with container engines like Podman and Docker to deploy AI models within containers, eliminating the need for manual configuration and ensuring optimal setup for both CPU and GPU systems.
Speakers

Eric Curtin

Principal Software Engineer, Red Hat
Principal Software Engineer at Red Hat working on AI and Automotive. Upstream maintainer of RamaLama, llama.cpp, inotify-tools, ostree, etc.
G001-002

15:40 CEST

From Cold Start To Warp Speed: Triton Kernel Caching With OCI Container Images - Maryam Tahhan & Alessandro Sangiorgi, Red Hat
Thursday August 28, 2025 15:40 - 16:05 CEST
Model startup latency is a persistent bottleneck for modern inference workloads, particularly when using custom kernels written in Triton that are Just In Time (JIT) compiled. In this talk, we’ll present a novel approach to speeding up model boot times by wrapping Triton kernel caches in OCI container images.
Speakers

Maryam Tahhan

Principal Software Engineer, Red Hat
Maryam is a Principal Engineer on the Emerging Tech team in the Office of the CTO at Red Hat. Her research is focused on networking and sustainability. She has contributed to and led several open-source projects, and has been working on AF_XDP, preparing it for cloud-native use cases…

Alessandro Sangiorgi

Software Engineer, Red Hat
Alessandro Sangiorgi is a Software Engineer in the Emerging Technologies Group within the Office of the CTO at Red Hat. He has extensive experience across Cloud, Distributed Systems, AI, and Networking products and technologies.
G001-002

16:15 CEST

Streamlining AI Pipelines With Elyra: From Development To Inference With KServe & vLLM - Ritesh Shah, Red Hat
Thursday August 28, 2025 16:15 - 16:40 CEST
This session will explore how Elyra, an open-source project that extends the JupyterLab user interface to simplify the development of data science and AI models, empowers data scientists and ML engineers to build, automate, and optimize end-to-end AI/ML pipelines with ease. We'll demonstrate how Elyra's visual pipeline editor simplifies workflow orchestration while integrating seamlessly with Kubeflow and other MLOps tools.
Speakers

Ritesh Shah

Senior Principal Architect, Red Hat
Ritesh Shah is a Senior Principal Architect with Red Hat and focuses on creating and using next-generation platforms, including AI/ML workloads as well as application modernisation and deployment.
G001-002

16:50 CEST

Scalable LLM Inference on Kubernetes With NVIDIA NIMS, LangChain, Milvus and FluxCD - Riccardo Freschi, AWS
Thursday August 28, 2025 16:50 - 17:15 CEST
Join us for a deep dive into architecting and implementing a scalable LLM inference service on Amazon EKS as the foundation for workload orchestration, incorporating NVIDIA NIM for optimal GPU utilization, LangChain for flexible LLM operations, Milvus for efficient vector storage, FluxCD for GitOps-driven deployments, Karpenter for horizontal scaling, and Prometheus and Grafana for observability.
Speakers

Riccardo Freschi

Sr. Solution Architect, AWS
Riccardo Freschi is a Sr. Solutions Architect at AWS focusing on application modernization. He works closely with partners and customers to help them transform their IT landscapes on their journey to the AWS Cloud by refactoring existing applications and building new ones, cloud…
G001-002