28-29, August 2025
Amsterdam, Netherlands
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for AI_dev Europe 2025 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Central European Summer Time, CEST (UTC +2). To see the schedule in your preferred timezone, please select from the drop-down menu to the right.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Venue: G001-002
Thursday, August 28
 

11:15 CEST

Securing AI Pipelines: Real-World Attacks on Kubernetes-Based AI Infrastructure - Abhinav Sharma, KodeKloud
Thursday August 28, 2025 11:15 - 11:40 CEST
When an ML engineer deploys a Stable Diffusion model to Kubernetes, they unwittingly create an attack surface unlike anything traditional security teams have encountered. I discovered this firsthand after our "perfectly secured" AI cluster was compromised.
Speakers

Abhinav Sharma

Site Reliability Engineer, KodeKloud
I am a Site Reliability Engineer at KodeKloud and an open source contributor who has evaluated and contributed to various open source tools and projects, such as Microsoft's open source libraries, OpenCV, and SUSE. I was also a Google Summer of Code 2022 contributor and a GitHub Extern…

11:50 CEST

Unlocking Scalable Distributed Training With Arrow Data Cache on Kubernetes - Ricardo Aravena, Snowflake & Andrey Velichkevich, Apple
Thursday August 28, 2025 11:50 - 12:15 CEST
As the scale of AI models and training datasets grows, so does the complexity of efficiently feeding data into GPU-accelerated training workloads. Traditional I/O stacks are becoming a bottleneck—especially in cloud native environments—where elasticity and performance must go hand in hand. This talk introduces an open-source, Arrow-based data cache for distributed training workloads on Kubernetes and tabular datasets stored as Apache Iceberg tables.
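The caching idea can be sketched in stdlib Python: materialize each dataset fragment once and serve repeat reads from memory instead of the remote store. The `fetch_fragment` helper and its data are hypothetical stand-ins; the project described in the talk would cache Arrow data read from Iceberg tables.

```python
from functools import lru_cache

# Hypothetical stand-in for a remote read of one table fragment. In the
# talk's setting this would be an Arrow record batch fetched over the network.
def fetch_fragment(table: str, fragment_id: int) -> list[int]:
    return list(range(fragment_id * 4, fragment_id * 4 + 4))

@lru_cache(maxsize=256)  # the cache: each fragment is materialized once
def cached_fragment(table: str, fragment_id: int) -> tuple[int, ...]:
    return tuple(fetch_fragment(table, fragment_id))

# Two "workers" reading the same fragment: the second read hits the cache,
# not the remote store.
a = cached_fragment("events", 3)
b = cached_fragment("events", 3)
print(cached_fragment.cache_info().hits)  # 1: second read served from cache
```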
Speakers

Ricardo Aravena

Software Engineer, Snowflake
Ricardo is an AI Infrastructure Lead at Snowflake. He is passionate about open source, serving in roles such as co-chair of the CNCF TAG-Runtime and lead of the Cloud Native AI Working Group. With over 25 years of experience in the tech industry…

Andrey Velichkevich

Software Engineer, Apple
Andrey Velichkevich is a Senior Software Engineer at Apple and a key contributor to the Kubeflow open-source project. He is a member of the Kubeflow Steering Committee and a co-chair of the Kubeflow AutoML and Training working groups. Additionally, Andrey is an active member of the CNCF WG AI. He…

13:35 CEST

From Hours To Milliseconds: Scaling AI Inference 10x With Serverless on Kubernetes - Anmol Krishan Sachdeva & Paras Mamgain, Google
Thursday August 28, 2025 13:35 - 14:00 CEST
Imagine deploying a complex AI model for real-time inference, but facing latency that slows down your application. We've all been there. This talk isn't just about theoretical serverless benefits; it's about real-world performance gains. We'll show you how we slashed inference latency from several seconds to under 100 milliseconds, achieving a 10x improvement in throughput, by harnessing the power of serverless on Kubernetes.
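One common ingredient behind warm-path latency wins of this kind is keeping the loaded model resident between invocations so only the first (cold) request pays the load cost. A minimal stdlib sketch of that pattern; the handler and "model" are toy stand-ins, not the speakers' actual stack:

```python
import time

_MODEL = None  # module-level cache so warm invocations skip the load

def load_model():
    time.sleep(0.05)  # stand-in for an expensive model load (cold start)
    return lambda x: x.upper()

def handler(request: str) -> str:
    global _MODEL
    if _MODEL is None:  # only the first (cold) invocation pays this cost
        _MODEL = load_model()
    return _MODEL(request)

t0 = time.perf_counter(); handler("cold"); cold = time.perf_counter() - t0
t0 = time.perf_counter(); handler("warm"); warm = time.perf_counter() - t0
print(cold > warm)  # the warm path skips the load entirely
```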
Speakers

Paras Mamgain

Software Engineer, Google
Paras has been an active speaker, sharing his technical expertise at Google tech conferences and at the Linux Foundation Open Source Summit in Japan and North America. Paras is a highly skilled backend developer with a passion for information retrieval and a knack for translating complex technical…

Anmol Krishan Sachdeva

Sr. Hybrid Cloud Architect, Google
Anmol is a seasoned International Tech Speaker (delivered 75+ talks), a Distinguished Guest Lecturer, an active conference organizer, and has published several notable papers. He works at Google and focuses on Emerging Technologies.

14:10 CEST

Fast Inference, Furious Scaling: Leveraging VLLM With KServe - Rafael Vasquez, IBM
Thursday August 28, 2025 14:10 - 14:35 CEST
In this talk, we will introduce two open-source projects, vLLM and KServe, and explain how they can be integrated to deliver better performance and scalability for LLMs in production. The session will include a demo showcasing their integration.
Speakers

Rafael Vasquez

Open Source Software Developer, IBM
Rafael Vasquez is a software developer on the Open Technology team at IBM. He previously completed an MASc. working on self-driving car research and transitioned from a data scientist role in the retail field to his current role where he continues to grow his passion for MLOps and... Read More →

14:45 CEST

RamaLama: Making Working With AI Models Cloud Native and Boring - Eric Curtin, Red Hat
Thursday August 28, 2025 14:45 - 15:10 CEST
Managing and deploying AI models can often require extensive system configuration and complex software dependencies. RamaLama, a new open-source tool, aims to make working with AI models straightforward by leveraging container technology, making the process "boring"—predictable, reliable, and easy to manage. RamaLama integrates with container engines like Podman and Docker to deploy AI models within containers, eliminating the need for manual configuration and ensuring optimal setup for both CPU and GPU systems.
Speakers

Eric Curtin

Principal Software Engineer, Red Hat
Principal Software Engineer at Red Hat working on AI and Automotive. Upstream maintainer of RamaLama, llama.cpp, inotify-tools, ostree, etc.

15:40 CEST

From Cold Start To Warp Speed: Triton Kernel Caching With OCI Container Images - Maryam Tahhan & Alessandro Sangiorgi, Red Hat
Thursday August 28, 2025 15:40 - 16:05 CEST
Model startup latency is a persistent bottleneck for modern inference workloads, particularly when using custom kernels written in Triton that are Just In Time (JIT) compiled. In this talk, we’ll present a novel approach to speeding up model boot times by wrapping Triton kernel caches in OCI container images.
Speakers

Maryam Tahhan

Principal Software Engineer, Red Hat
Maryam is a Principal Engineer on the Emerging Tech team in the Office of the CTO at Red Hat. Her research is focused on networking and sustainability. She has contributed to and led several open source projects, and has been working on AF_XDP and preparing it for cloud native use cases…

Alessandro Sangiorgi

Software Engineer, Red Hat
Alessandro Sangiorgi is a Software Engineer in the Emerging Technologies Group within the Office of the CTO at Red Hat. He has extensive experience across Cloud, Distributed Systems, AI, and Networking products and technologies.

16:15 CEST

Streamlining AI Pipelines With Elyra: From Development To Inference With KServe & VLLM - Ritesh Shah, Red Hat
Thursday August 28, 2025 16:15 - 16:40 CEST
This session will explore how Elyra, an open source project that extends the JupyterLab user interface to simplify the development of data science and AI models, empowers data scientists and ML engineers to build, automate, and optimize end-to-end AI/ML pipelines with ease. We’ll demonstrate how Elyra’s visual pipeline editor simplifies workflow orchestration while integrating seamlessly with Kubeflow and other MLOps tools.
Speakers

Ritesh Shah

Senior Principal Architect, Red Hat
Ritesh Shah is a Senior Principal Architect with Red Hat and focuses on creating and using next-generation platforms, including AI/ML workloads as well as application modernisation and deployment.

16:50 CEST

Scalable LLM Inference on Kubernetes With NVIDIA NIMS, LangChain, Milvus and FluxCD - Riccardo Freschi, AWS
Thursday August 28, 2025 16:50 - 17:15 CEST
Join us for a deep dive into architecting and implementing a scalable LLM inference service on Amazon EKS as the foundation for workload orchestration, incorporating NVIDIA NIMS for optimal GPU utilization, LangChain for flexible LLM operations, Milvus for efficient vector storage, FluxCD for GitOps-driven deployments, Karpenter for horizontal scaling, and Prometheus and Grafana for observability.
Speakers

Riccardo Freschi

Sr. Solution Architect, AWS
Riccardo Freschi is a Sr. Solutions Architect at AWS focusing on Application Modernization. He works closely with partners and customers to help them transform their IT landscapes in their journey to the AWS Cloud, refactoring existing applications and building new ones, cloud…
 
Friday, August 29
 

11:10 CEST

Monitoring GenAI Applications - Prasad Mujumdar, Okahu AI
Friday August 29, 2025 11:10 - 11:35 CEST
GenAI observability is the proactive monitoring of your AI apps and the cloud infrastructure they run on, to understand how to make them work better. The ever-evolving landscape of GenAI technologies makes this very challenging to manage. Project Monocle is built for app developers to trace their app code in any environment without lots of custom code decoration. It’s a community-driven open source project under Linux Foundation AI & Data, built on top of OpenTelemetry, and it provides out-of-the-box support for several GenAI tech components. With little to no code changes, you can generate OpenTelemetry-compatible traces and spans of your GenAI application. Monocle provides a consistent format for describing entities like LLMs and vector stores as well as events like prompts and responses, and it can be integrated with apps in personal dev/lab environments as well as cloud deployments.
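The flavor of this kind of tracing can be sketched in stdlib Python: wrap each GenAI call and record a span with a name, duration, and output metadata. This decorator is a conceptual illustration only, not Monocle's actual API:

```python
import functools
import time

TRACES = []  # collected spans, in the spirit of OpenTelemetry-style tracing

def traced(name: str):
    """Record a span (name, duration, output size) around a GenAI call."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACES.append({
                "span": name,
                "duration_s": time.perf_counter() - start,
                "output_chars": len(str(result)),
            })
            return result
        return inner
    return wrap

@traced("llm.completion")
def fake_llm(prompt: str) -> str:  # stand-in for a real model call
    return f"echo: {prompt}"

fake_llm("hello")
print(TRACES[0]["span"])  # llm.completion
```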
Speakers

Prasad Mujumdar

CTO, Okahu AI
Prasad is the founder and CTO of Okahu AI and leads the LF AI & Data project Monocle. He has extensive experience in data management, data governance, and AI technology, having previously held technical leadership positions at IBM, Microsoft, and Cloudera.

11:45 CEST

Unlocking How To Train 107B Model Across Multi-Region Heterogeneous Clusters Using K8s - Xiao Zhang, dynamia.ai
Friday August 29, 2025 11:45 - 12:10 CEST
As large language models enter the era of hundreds of billions of parameters, the traditional single-cluster LLM training method faces three challenges. First, it is difficult to build large homogeneous GPU clusters. Second, there is a physical limit to the scale of AI accelerators in a single cluster. Third, resource fragmentation leaves a single cluster short of resources even while global resources are abundant. Multi-domain heterogeneous training technology can effectively address these challenges.
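One building block of heterogeneous multi-cluster training is assigning work in proportion to each cluster's measured throughput, so faster accelerators receive more samples per step. A toy sketch; cluster names and throughput numbers are made up:

```python
# Split a global batch across heterogeneous clusters in proportion to
# measured throughput (samples/sec), so no cluster idles waiting for others.
def split_batch(global_batch: int, throughputs: dict[str, float]) -> dict[str, int]:
    total = sum(throughputs.values())
    shares = {k: int(global_batch * v / total) for k, v in throughputs.items()}
    # hand any rounding remainder to the fastest cluster
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += global_batch - sum(shares.values())
    return shares

print(split_batch(1024, {"us-east-h100": 3.0, "eu-west-a100": 1.0}))
# {'us-east-h100': 768, 'eu-west-a100': 256}
```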
Speakers

Xiao Zhang

Software Engineer & CEO, dynamia.ai
Xiao Zhang leads the Container team (focused on infra, AI, multi-cluster, cluster LCM, and OCI). He is also an active community contributor and cloud native enthusiast. He is currently a member of Kubernetes/kubernetes-sigs and a maintainer of Karmada, kubean, HAMi, and cloudtty…

13:35 CEST

Architecting the AI Bridge: Integrating LLMs and DMN With Drools, Spring Boot, and the MCP Protocol - Tim Wuthenow & Alex Porcelli, Aletyx, Inc.
Friday August 29, 2025 13:35 - 14:00 CEST
As GenAI continues to gain traction, enterprise architects face a critical challenge: how to harness the flexibility of LLMs while ensuring structure, compliance, and explainability. This talk presents a modern AI architecture that bridges Symbolic AI (via DMN and rule engines) with Non-Symbolic AI (via LLMs), enabling powerful, guardrail-enabled systems built on open standards.
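The bridge can be sketched as: the LLM handles unstructured extraction, while an explicit, DMN-style decision table makes the final call, so every outcome is explainable. The rule table below is an illustrative stand-in, not a real DMN model or the Drools API:

```python
# A symbolic guardrail in the spirit of a DMN decision table: ordered
# (condition, decision) rows, evaluated top to bottom, with a default row.
RULES = [
    (lambda amount, score: amount > 10_000, "manual-review"),
    (lambda amount, score: score < 0.5, "reject"),
    (lambda amount, score: True, "approve"),  # default row
]

def decide(amount: float, score: float) -> str:
    for condition, decision in RULES:
        if condition(amount, score):
            return decision
    raise ValueError("no rule matched")

# An LLM might extract (amount, score) from free text; the decision itself
# stays in the rule table, where it is auditable.
print(decide(amount=500, score=0.9))     # approve
print(decide(amount=50_000, score=0.9))  # manual-review
```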
Speakers

Alex Porcelli

Co-Founder, Aletyx, Inc.
Alex Porcelli is a seasoned architect and engineering leader with over 25 years of professional development experience. A passionate open-source advocate, he has actively contributed to projects like Drools, jBPM, Kogito, Hibernate, and more for over 15 years. Alex spent more than a…

Tim Wuthenow

Co-Founder, Aletyx, Inc.
Enterprise AI architect with 15+ years in business automation across IBM and Red Hat. Open source contributor to Drools, jBPM, and Kogito. Specialized in integrating rule engines with AI for compliant, explainable systems. Implemented mission-critical decision services in healthcare... Read More →

14:10 CEST

Benchmarking GenAI Like a Pro: Scaling Experiments, Predicting Performance, and Keeping Your Sanity - Michael Johnston, IBM
Friday August 29, 2025 14:10 - 14:35 CEST
Generative AI is moving fast—and if you're responsible for deploying or tuning these models, you're probably feeling the heat. New LLMs, hardware, and training methods are landing constantly. How do you make sense of it all? How do you actually know what’s performant, what’s cost-effective, and what breaks the moment your stack changes?
Speakers

Michael Johnston

Research Scientist & Manager, Next Generation Systems Team, IBM Research Ireland
Michael Johnston is Research Scientist and Manager of the Next Generation Systems team at IBM Research Ireland. His background is in High Performance Computing, application design, and computational biophysics and biochemistry.  He has worked closely with the Hartree Centre, IBM’s collaboration with the UK’s Science and Technology Facilities Council (STFC), on a range of projects. This included the Square Kilometre Array (SKA) Telescope, aimed at studying the universe in unprecedented detail, and Oasis — a tool for estimating the cost of catastrophes... Read More →

14:45 CEST

Building AI Workflows: From Local Experiments To Serving Users - Oleg Šelajev, Docker
Friday August 29, 2025 14:45 - 15:10 CEST
Everyone can throw together an LLM, some MCP tools, and a chat interface, and get an AI assistant we could only dream of a few years back. Add some “business logic” prompts, and you get an AI workflow; hopefully a helpful one. 
Speakers

Oleg Šelajev

DevRel Engineer, Docker
Oleg Šelajev is a developer advocate at Docker working mainly on developer productivity, Testcontainers, improving how we set up local development environments and tests, and building applications with AI parts. Developer. Author. Speaker. Java Champion. Docker captain. 

15:35 CEST

AI/ML Networking Challenges: The Fast and the Finnicky! - Lerna Ekmekcioglu, Clockwork Systems 
Friday August 29, 2025 15:35 - 16:00 CEST
Race cars are engineered for peak performance, designed to push the limits of speed while maintaining control and stability. Similarly, AI/ML jobs need a high-speed, reliable network fabric to deliver results efficiently. Just like a race car depends on a well-maintained race track to perform at its best, AI/ML jobs rely on network fabrics—such as RoCE and InfiniBand—to ensure fast, reliable communications.
Speakers

Lerna Ekmekcioglu

Sr. Solutions Engineer, Clockwork Systems
Lerna is a Sr. Solutions Engineer at Clockwork Systems, where she helps customers meet their performance goals with software solutions built on Clockwork.io’s foundational research. Prior to this, she was a Sr. Solutions Architect at AWS for 3 years. Lerna spent 17 years as an infrastructure…

16:10 CEST

Declarative Device Virtualization: Orchestrating GPUs & Hardware in Cloud Native Environments - Samrat Priyadarshi, Google
Friday August 29, 2025 16:10 - 16:35 CEST
This session explores how declarative pipelines revolutionize GPU and hardware virtualization within Kubernetes. We'll address the challenges of managing specialized hardware resources in cloud-native applications and demonstrate how to orchestrate virtualized devices with ease. Attendees will learn to define device configurations using YAML, deploy virtualized GPUs, FPGAs, and other accelerators into Kubernetes clusters, and automate complex hardware interactions. We'll cover practical examples of using declarative pipelines for AI/ML workloads, edge computing, and high-performance computing (HPC). This talk empowers developers and operators to unlock the full potential of hardware resources, building scalable, resilient, and adaptable device virtualization solutions, specifically focusing on GPU management and optimization.
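The declarative pattern at the heart of this approach can be sketched as a reconcile step over desired vs. actual device state: the operator declares what should exist, and the controller computes what to create or delete. Device names are hypothetical; a real implementation would read the desired set from YAML and drive a device plugin:

```python
# Desired state is data (e.g., parsed from YAML); actual state comes from
# inventory. Reconciliation is a pure diff between the two sets.
desired = {"gpu-0/vgpu-a", "gpu-0/vgpu-b", "fpga-1/slot-0"}
actual = {"gpu-0/vgpu-a", "gpu-1/vgpu-x"}

def reconcile(desired: set[str], actual: set[str]) -> dict[str, set[str]]:
    return {"create": desired - actual, "delete": actual - desired}

plan = reconcile(desired, actual)
print(sorted(plan["create"]))  # virtual devices still to be provisioned
```

Running the diff repeatedly, rather than executing imperative setup scripts once, is what makes the system self-healing when actual state drifts.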
Speakers

Samrat Priyadarshi

Cloud Engineer, Google
Samrat is a Cloud Engineer at Google with 8 years of experience in cloud computing, focusing mainly on Kubernetes and related landscapes. He has spoken at multiple international and national conferences, including Open Source Summit Japan 2024. He has a YouTube channel with more…

16:45 CEST

AI Showdown: Open Source Tools for LLM Face-Offs - Jigyasa Grover, Independent & Rishabh Misra, Attentive
Friday August 29, 2025 16:45 - 17:10 CEST
In this session, we will explore how open-source tools like LLM Comparator can revolutionize the way we evaluate and compare LLMs. LLM Comparator, an interactive tool with a companion Python library, enables developers to scale and simplify the process of side-by-side model evaluation. We'll cover how this open-source solution enhances transparency, fosters collaboration, and accelerates model optimization within the AI community.
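Side-by-side evaluation of this kind can be sketched as: run both models on the same prompts and tally a judge's preferences. The models and judge below are trivial stand-ins, not LLM Comparator's actual API:

```python
# Two stand-in "models" producing different answers to the same prompt.
def model_a(prompt: str) -> str:
    return prompt.upper()

def model_b(prompt: str) -> str:
    return "answer: " + prompt  # always longer than model_a's output

def judge(prompt: str, out_a: str, out_b: str) -> str:
    # Toy judge that prefers the longer answer; in practice this would be
    # an LLM judge or a human rater.
    if len(out_a) > len(out_b):
        return "A"
    if len(out_b) > len(out_a):
        return "B"
    return "tie"

def compare(prompts: list[str]) -> dict[str, int]:
    tally = {"A": 0, "B": 0, "tie": 0}
    for p in prompts:
        tally[judge(p, model_a(p), model_b(p))] += 1
    return tally

print(compare(["hello", "side by side"]))  # model_b wins under this toy judge
```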
Speakers

Jigyasa Grover

AI Lead, Data Scientist & ML Engineer • 10x AI & Open Source Award Winner • Google Developer Expert • 'Sculpting Data for ML' Book Author
Jigyasa Grover, a 10-time AI award winner and author of the book Sculpting Data for ML, has ML engineering expertise from Twitter, Facebook, and Faire. A Google Developer Expert and Women Techmakers Ambassador, she was featured at Google I/O 2024 on Gemini 1.5 Pro. As a World Economic…

Rishabh Misra

Staff Machine Learning Engineer, Attentive
Author of the book "Sculpting Data for ML", I am a Staff ML Engineer & Researcher recognized by the US Government for outstanding contribution to ML research. I have extensively published and reviewed research at top AI conferences in NLP (LLMs / GenAI), Deep Learning, and Applied... Read More →
 