---
title: AI Keynote discussion
weight: 2
tags:
- keynote
- ai
- panel
---
<!-- {{% button href="https://youtu.be/VhloarnpxVo" style="warning" icon="video" %}}Watch talk on YouTube{{% /button %}} -->
A panel discussion (somewhat scripted) led by Priyanka
## Guests
* Tim from Mistral
* Paige from Google AI
* Jeff, founder of Ollama
## Discussion
* What did you use as the basis for developing Ollama?
  * Jeff: The concepts from Docker, Git, and Kubernetes
* How is the balance between AI engineering and AI ops?
  * Jeff: It's the classic dev-vs-ops divide; many ML engineers don't think about operations
  * Paige: Yessir
* How does infrastructure keep up with the fast pace of research?
  * Paige: Well, it doesn't - but they do their best, and cloud native helps
  * Jeff: Well, we're not Google, but Kubernetes is the savior
* What are the scaling constraints?
  * Jeff: Sizing of models is currently still in its infancy
  * Jeff: There will be more specialized hardware, and someone will have to support it
  * Paige: Sizing also depends on latency needs (code autocompletion vs. performance optimization)
  * Paige: Optimization of smaller models
* Which technologies need to be open source licensed?
  * Jeff: The model, because of access and trust
  * Tim: The models and the base execution environment -> vendor agnosticism
  * Paige: Yes, and remixes are really important for development
* Anything else?
  * Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world?
  * Paige: Currently many people just use paid APIs to abstract away the infra, but we need this stuff to be self-hostable
  * Tim: I don't want to know about the hardware; the whole infra side should be handled by cloud native teams so ML engineers can just be ML engineers