| title | weight | tags |
| --- | --- | --- |
| AI Keynote discussion | 2 | |
A panel discussion (somewhat scripted) led by Pryanka
Guests
- Tim from Mistral
- Paige from Google AI
- Jeff, founder of Ollama
Discussion
- What did you base the development of Ollama on?
  - Jeff: The concepts from Docker, Git, and Kubernetes (see the first sketch after this list)
- How is the balance between AI engineering and AI ops?
  - Jeff: The classic dev vs. ops divide; many ML engineers don't think about operations
  - Paige: Yessir
- How does infrastructure keep up with the fast pace of research?
  - Paige: Well, they don't - but they do their best, and cloud native is cool
  - Jeff: Well, we're not Google, but Kubernetes is the saviour
- What are the scaling constraints?
  - Jeff: Currently, sizing of models is still in its infancy
  - Jeff: There will be more specific hardware, and someone will have to support it
  - Paige: Sizing also depends on latency needs (code autocompletion vs. performance optimization)
  - Paige: Optimization of smaller models
- What technologies need to be open-source licensed?
  - Jeff: The model, because of access and trust
  - Tim: The models and the base execution environment -> vendor agnosticism
  - Paige: Yes, and remixes are really important for development
- Anything else?
  - Jeff: How do we bring our awesome tools (monitoring, logging, security) to the new AI world?
  - Paige: Currently many people just use paid APIs to abstract away the infrastructure, but we need this stuff to be self-hostable (see the second sketch after this list)
  - Tim: I don't want to know about the hardware; the whole infra side should be handled by the cloud-native teams so that ML engineers can just be ML engineers
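
The Docker analogy Jeff draws is easiest to see in Ollama's workflow: models are pulled from a registry by name and tag and then run locally, much like container images. A minimal sketch, assuming a locally running Ollama server on its default port; the `llama3` model tag and the prompt are illustrative, not something discussed in the panel.

```python
# Sketch of the Docker-like workflow: pull a named model, then run it locally.
# Assumes a local Ollama server on the default port; the model tag is illustrative.
import requests

# Analogous to `docker pull`: fetch a named, versioned model from the registry
# (on the command line: `ollama pull llama3`).

# Analogous to `docker run`: ask the local server to run the model once.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",           # model name + tag, much like an image reference
        "prompt": "Why is the sky blue?",
        "stream": False,             # return a single JSON response instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])       # the generated text
```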
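
On Paige's point about paid APIs abstracting away the infrastructure: one reason self-hostable backends matter is that the same client code can be pointed at local infrastructure instead of a hosted service. A hedged sketch, assuming a local Ollama server exposing its OpenAI-compatible endpoint; the base URL, placeholder API key, and model name are assumptions for illustration.

```python
# Sketch: the same OpenAI-style client code, pointed at a self-hosted endpoint
# instead of a paid API. Assumes a local Ollama server with its OpenAI-compatible
# API enabled; the model name is illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # self-hosted endpoint instead of the hosted API
    api_key="not-needed-locally",          # placeholder; a local server typically ignores it
)

chat = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Summarize the keynote in one sentence."}],
)
print(chat.choices[0].message.content)
```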