Episode
Intelligent Apps on AKS Ep02: Bring Your Own AI Models to Intelligent Apps on AKS with Kaito
with Paul Yu, Ishaan Sehgal, Steven Murawski
Join us to learn how to run open-source Large Language Models (LLMs) with HTTP-based inference endpoints inside your AKS cluster using the Kubernetes AI Toolchain Operator (KAITO). We'll walk through the setup and deployment of containerized LLMs on GPU node pools and see how KAITO can help reduce the operational burden of provisioning GPU nodes and tuning model deployment parameters to fit GPU profiles.
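As a rough sketch of the kind of resource covered in the episode, KAITO lets you declare a model deployment as a `Workspace` custom resource; KAITO then provisions a matching GPU node pool and stands up the inference endpoint. The model preset and instance type below are illustrative assumptions, not necessarily what the demo uses:

```yaml
# Hypothetical KAITO workspace: requests a GPU node and a preset LLM.
# The preset name and VM size are examples; check KAITO's supported
# model presets and your subscription's GPU quota before applying.
apiVersion: kaito.sh/v1alpha1
kind: Workspace
metadata:
  name: workspace-falcon-7b
resource:
  instanceType: "Standard_NC12s_v3"   # GPU SKU KAITO will provision
  labelSelector:
    matchLabels:
      apps: falcon-7b
inference:
  preset:
    name: "falcon-7b"                 # preset model image to serve
```

Applying this with `kubectl apply -f` would, under those assumptions, leave you with an in-cluster HTTP inference endpoint without manually tuning the deployment to the GPU profile.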
Learning objectives
- Learn how to extend existing microservices with AI capabilities.
- Understand using progressive enhancement to integrate AI capabilities in existing applications.
- Learn how to use open-source or custom Large Language Models (LLMs) with existing applications.
- Learn how to run open-source or custom Large Language Models on Azure Kubernetes Service (AKS).
Chapters
- 00:00 - Introduction
- 02:40 - Learning objectives
- 04:35 - Demo - Deploy AKS store demo app
- 11:00 - AI workloads on AKS
- 15:53 - AI and ML on AKS
- 34:40 - What is KAITO?
- 42:03 - Challenges with BYO Models
- 44:49 - Demo
- 01:16:04 - Summary
Recommended resources
Related episodes
- Full series: Learn Live: Intelligent Apps on AKS
Connect
- Paul Yu | LinkedIn: /in/yupaul
- Ishaan Sehgal | LinkedIn: /in/ishaan-sehgal
- Steven Murawski | Twitter: @StevenMurawski | LinkedIn: /in/usepowershell
Have feedback? Submit an issue here.