---
theme: slidev-theme-dataroots
layout: intro
lineNumbers: false
themeConfig:
  title: MLOps - Bringing ideas to production 🚀
  github: murilo-cunha
  twitter: _murilocunha
  linkedin: in/murilo-cunha
title: MLOps
info: |
  ## MLOps
class: text-center
highlighter: shiki
drawings:
  persist: true
mdc: true
hideInToc: true
favicon:
titleTemplate: "%s"
---

[MLOops]{v-mark.crossed-off=1} to MLOps



Bringing ideas to production 🚀

March 15th, 2024


---
hideInToc: true
---

Agenda



::left::

::right::



<style> li:not(li:first-child) { margin-top: 0; } </style>

About me

  • 🇧🇷 → 🇧🇪: Brazilian @ Belgium
  • 🤓 B.Sc. in Mechanical Engineering @PNW
  • 👨‍🎓 M.Sc. in Artificial Intelligence @KUL
  • Professional Data & ML Engineer
  • Machine Learning Specialty
  • Terraform Associate
  • DAG Authoring & Airflow
  • SnowPro Core
  • Prefect Associate
  • 🤪 Fun facts: 🐍, 🦀, 🐓
  • 🫂 Python User Group Belgium
  • 📣 Confs: 🇯🇵, 🇵🇱, 🇮🇪, 🇵🇹, 🇪🇸, 🇸🇪
  • 🎙️ Datatopics Unplugged Podcast
  • 🤖 Tech lead AI @
<style> li { margin-top: 0 !important; } </style>

---
hideInToc: true
layout: default
---

I have worked on different [Data/AI projects]{.gradient-text}


::left::



  • Events company 📣
  • No show prediction 🫥
  • Record deduplication 👯‍♀️
  • Recommend visitors and exhibitors 🤝
  • PoC → MVP → Production 🚀

::right::





---
hideInToc: true
layout: twocols
---

From the [prototyping]{.gradient-text} side...


::left::

::right::



  • Content moderation @ social media company 🤬
  • NER @ clinical studies 🔎
  • Q&A chatbots @ automotive industry 🏎️
  • Energy consumption forecasting @ public sector 📈
  • Network analysis @ accounting company 🕸️

---
hideInToc: true
---

...to

production

applications


::left::




  • Financial sector 💰
  • Early customer lifetime value 🤑
  • Pipeline migrations 🧑‍🔧
  • Churn prediction 🫠

::right::


---
hideInToc: true
---

Why am I [here]{.gradient-text}?


"To help us understand what it takes for a machine learning project to go from [idea to production]{v-mark.box.yellow=1}, looking closely at the differences between [machine learning]{.gradient-text} and [operations]{.gradient-text}"


Use case: [content moderation]{.gradient-text}

“pixel art angry face with symbols on mouth censoring profanity” - DALL·E 2


---
hideInToc: true
---

What is [content moderation]{.gradient-text}?

::left::


  • You're the CEO of 10gag (congrats! 🎉)
    • (Like 9gag, but better)
  • Things haven't been so good lately 🫣
  • You have some trolls leaving nasty comments 🤬
  • You have an idea! 💡
    • You can probably detect these comments, and remove them from the platform
    • How well can we identify these comments using machine learning?

::right::

{.rounded-lg .shadow-lg .scale-80}


---
hideInToc: true
---

So you get to [work]{.gradient-text}...




::left::

{.rounded.shadow.scale-120.mx-10 v-click}

::right::

{.rounded .shadow-xl .object-contain v-click}


---
layout: cover
---

ML [Deployment]{.gradient-text}

(Congrats! 🎉 Many projects don't get this far)


---
hideInToc: true
---

What does it [mean]{.gradient-text}?

We are getting [value]{v-mark.highlight.yellow=1} from our models

How do we go about it for [content moderation]{.gradient-text}? How are we using the model?


::left::

{.h-60 .w-60 .object-cover .rounded-xl .shadow}

::right::

{.h-60 .w-60 .object-cover .rounded-xl .shadow}


Batch vs. real time

::left::

Batch 🍪

$\approx$ scheduled

  1. Accumulate predictions and run them together
  2. Schedule runs every hour/day/week/month
  3. Write the predictions to a table or a dashboard
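The three steps above as a scheduled job — a minimal sketch with illustrative names (`moderate_batch` is not the demo's actual code):

```python
# Minimal batch-scoring sketch: accumulate comments, run them together
# on a schedule, and write the resulting rows to a table/dashboard.
def moderate_batch(comments: list[str]) -> list[dict]:
    """Score accumulated comments; rows are ready to persist."""
    # A real model call goes here; we naively flag comments containing "hate".
    return [{"comment": c, "toxic": "hate" in c.lower()} for c in comments]
```

In practice an orchestrator (a cron job, an Airflow DAG, etc.) invokes this on the chosen schedule.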

{.scale-150}

::right::

Real time 👟

$\approx$ event-driven

  1. User does something
  2. This "something" triggers a (REST) API call
  3. The call returns a result/action
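The event path, sketched as plain functions (in a real service this would sit behind a REST framework; `classify` and `on_comment_posted` are illustrative names):

```python
def classify(comment: str) -> bool:
    """Stand-in for the real model call."""
    return "hate" in comment.lower()

def on_comment_posted(comment: str) -> dict:
    """Triggered per user event, e.g. via a (REST) API call; returns the action."""
    return {"action": "remove" if classify(comment) else "keep"}
```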

{.rounded-lg .shadow}

::bottom::

When should we choose one over the other?

<style> img { @apply h-25 !important; } </style>

---
layout: cover
---

Hacker News content moderation in action 🚀

(Simplified) batch deployment demo


---
hideInToc: true
---

The [tech]{.gradient-text} 🧱




---
layout: cover
---

[Real time]{.gradient-text} deployments 🚀

And the problem of [latency]{v-mark.red=0}


---
hideInToc: true
---

Real time applications

[Serving]{.gradient-text} models

::right::

::left::

```mermaid
flowchart LR
  subgraph API
    direction TB
    cloud("☁️") <--> phone("📱")
  end
  API --> batch("🍪")
  API --> realtime("👟")
```

---
hideInToc: true
---

ML in production

Serving (API)

<iframe src="https://asciinema.org/a/646947/iframe?speed=5" p-5 w-full h-105/>

The problem of latency


---
hideInToc: true
---

Why latency?


compute available ⚖️ compute required

::left::

ML


  • Compute
  • Complexity
  • Size

::right::

Ops


  • Cold starts
  • Network
  • IO

::bottom::

```mermaid
graph LR
    bread("🥖") <--> bike("🚴‍♀️") <--> person("🥸")
```

---
hideInToc: true
---

What is latency?


“Latency is a [measurement]{v-mark.red="'+1'"} in Machine Learning to determine the performance of various models for a specific application. Latency refers to the [time taken to process one unit of data provided only one unit of data is processed at a time]{v-mark="{type:'highlight', color:'yellow', multiline:true, at:'+1'}"}.”
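Under that definition, latency is just the wall-clock time for one unit of data — a minimal measurement sketch (the `predict` stand-in is illustrative):

```python
import time

def predict(comment: str) -> bool:
    return "hate" in comment.lower()  # stand-in for a model forward pass

# Time a single request: one unit of data, processed alone.
start = time.perf_counter()
predict("some comment")
latency_ms = (time.perf_counter() - start) * 1000
print(f"latency: {latency_ms:.3f} ms")
```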


---
hideInToc: true
---

Latency [in action]{.gradient-text}

::left::

"ChatGPT-like"

  • Prompts:
    • Generate a list of the 10 most beautiful cities in the world.
    • How can I tell apart female and male red cardinals?

```
rootsacademy-model-latency/
├── ...
├── common
│   ├── __init__.py
│   └── utils.py
├── pyproject.toml
└── scripts
    ├── local.py
    └── remote.py
```

::right::

<style> li:not(li:first-child) { margin-top: 0; } </style>

---
hideInToc: true
---

Latency [in action]{.gradient-text}


<iframe src="https://asciinema.org/a/8PRQDcwFUTLXQ2WQYqczGV5IY/iframe?speed=2&idleTimeLimit=3" p-5 w-full h-105/>

---
hideInToc: true
---

Latency - what can we do about it?


::left::

::right::

::bottom::


---
hideInToc: true
---

Each scaling strategy has its [trade-offs]{.gradient-text}


::left::

  • Cloud makes it easy to scale vertically
  • Higher overhead to manage clusters (horizontal scaling)
  • Vertical scaling has limits
  • [Serverless machines]{.gradient-text.font-bold} can help optimize costs
  • Horizontal scaling can be autoscaled
  • Both may be cost efficient, depending on the setup
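The distinction matters because scaling out raises throughput more than it cuts per-request latency — a sketch, treating each worker as a model replica (`predict` is a stand-in):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def predict(x: int) -> int:
    time.sleep(0.01)  # stand-in for one model inference (~10 ms)
    return 2 * x

# Horizontal scaling: 4 "replicas" serve 8 requests in parallel.
# Total wall time drops, but each individual request still takes ~10 ms.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(predict, range(8)))
```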

::right::


{.rounded-lg.h-60.shadow}


[What's the main cause of your latency?]{v-mark="{type:'highlight', color:'yellow', multiline:true, at:'+0'}"}

<style> li { margin-top: 0 !important; } </style>

---
hideInToc: true
---

Scaling [in action]{.gradient-text}


<iframe src="https://asciinema.org/a/x5zX5jFKYtFVzg0FZXL9lW9Mo/iframe?speed=3&idleTimeLimit=3" p-5 w-full h-105/>

---
hideInToc: true
---

Latency [solved]{.gradient-text}?



::left::

::right::


---
hideInToc: true
---

Instead of growing, we can [shrink]{.gradient-text}


::left::


<iframe bg-slate-50 p-2 w-full h-60 rounded shadow src="https://ggml.ai/" />

::right::
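Quantization shrinks models by storing weights in fewer bits — a framework-free sketch of affine int8 quantization (real toolkits such as ggml do this per tensor, with calibrated scales):

```python
def quantize(weights: list[float], bits: int = 8) -> tuple[list[int], float]:
    """Map floats to signed ints so that w ≈ q * scale (4x smaller than fp32)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(quants: list[int], scale: float) -> list[float]:
    return [q * scale for q in quants]

weights = [0.12, -0.5, 0.33]
quants, scale = quantize(weights)
restored = dequantize(quants, scale)  # close to `weights`, small rounding error
```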


---
hideInToc: true
---

Quantization [in action]{.gradient-text}

<iframe src="https://asciinema.org/a/iphbEPhRNdS01C2aupWDj5VMM/iframe?speed=2&idleTimeLimit=3" p-5 w-full h-115/>

---
hideInToc: true
---

Scaling is not always possible on [the edge]{.gradient-text}



“Edge machine learning (edge ML) is the process of running machine learning algorithms on computing devices [at the periphery of a network]{v-mark="{type:'highlight', color:'yellow', multiline:true, at:'+1'}"} to make decisions and predictions as close as possible to the originating source of data.”



---
hideInToc: true
---

For limited resources and extremely low latencies, you may need to [look outside ML]{.gradient-text}


::left::

<iframe w-full h-85 rounded shadow src="https://arxiv.org/pdf/2212.09410.pdf" />

::right::



---
layout: cover
---

[Real time]{.gradient-text} use cases


---
hideInToc: true
---

Auto-en-joker @ dataroots


---
hideInToc: true
---

Auto-en-joker @ dataroots


---
hideInToc: true
---

Edge defect detection @ chip manufacturer





::left::

::right::


```mermaid
flowchart LR
    camera("📸") --> ruler("📐")
    ruler --> pos("👍")
    ruler --> neg("👎")
    neg --> brain("🧠")
    brain --> _pos("👍")
    brain --> _neg1("👎")
```

---
layout: cover
---

Why MLOps?


---
hideInToc: true
---

So you have a [model]{.gradient-text}...



::right::


{.rounded .shadow-xl .object-contain v-click=1}

::left::

👨‍💼 "How long will it take to go through 100 posts? How can we make it faster?"

👷‍♀️ "How can we make sure the model scales?"

👷‍♂️ "What packages did you use?"

😡 "Why is it removing my posts?"

👩‍🔬 "What models did you already try?"

🕵️ "What data was used to train this model?"

::bottom::

[MLOps decreases the burden of deploying ML systems by establishing best practices]{v-mark.highlight.yellow=8}


Real life [testimonials]{.gradient-text}

“At this point, everybody does what they like, there is [little to no standardisation]{v-mark.red=1}. Since there are little to [no best practices]{v-mark.box.yellow=1}, the current platform contains the largest common denominator of a lot of heterogeneous projects. This causes a lot of [burden in maintaining]{v-mark.circle.pink=1} these projects”

“For quite some time, the focus was on more traditional Business Intelligence and Data Engineering. More recently we have seen the focus shift more towards Advanced Analytics in the form of some scattered initiatives and products, which in turn led to [little success]{v-mark.highlight.yellow=2} on these.”

“While I love our Data Science team, the [code they write is not at all up to standards]{v-mark.box.red=3} in comparison to what we normally push into production. This puts a [heavy burden]{v-mark.circle.yellow=3} on the Data Engineering team to [rewrite and refactor]{v-mark.highlight.yellow=3} this. At the same time the Data Science team is often [unhappy]{v-mark.blue=3}, because this refactoring process tends to introduce [mistakes or misunderstandings]{v-mark.highlight.cyan=3}.”

When we talk MLOps we often talk deployment and/or [deployed models]{.gradient-text}

<style> h3 { @apply text-base !important; } </style>

---
layout: cover
title: What is MLOps?
---

What is [ML]{v-mark.red=1}[Ops]{v-mark.circle.yellow=2}?


---
hideInToc: true
---

What's in the [name]{.gradient-text}?


::left::

Machine learning 🧠


  • Experimentation
  • Data exploration
  • Modelling
  • Hyperparameter tuning
  • Evaluation

::right::

Operations ⚙️


  • Infrastructure
  • Scalability
  • Reproducibility
  • Monitoring/Alerting
  • Automation

::bottom::

[✨ MLOps ✨]{.flex .justify-center .'-mt-20' v-click}


DevOps vs. [MLOps]{.gradient-text}


::left::

MLOps


  • Iterative-Incremental Development
  • [Automation]{v-mark.blue="'+1'"}
  • [Continuous Deployment]{v-mark.blue="'+1'"}
  • [Versioning]{v-mark.blue="'+1'"}
  • [Testing]{v-mark.blue="'+1'"}
  • [Reproducibility]{v-mark.blue="'+1'" v-mark.box.red="'+2'"}
  • Monitoring

::right::

vs. DevOps


+ Model

+ Features

+ Data


---
hideInToc: true
---

DevOps vs. [MLOps]{.gradient-text}

{.rounded .shadow .bg-blue .scale-50 v-click .'-mt-20' .'-mb-25'}

"[...] By codifying these practices, we hope to accelerate the adoption of ML/AI in software systems and fast delivery of intelligent software. In the following, we describe a set of important concepts in MLOps such as [Iterative-Incremental Development, Automation, Continuous Deployment, Versioning, Testing, Reproducibility, and Monitoring]{v-mark="{type:'highlight', color:'yellow', multiline:true, at:2}"}."


---
hideInToc: true
---

So... what is it?


“The [level]{v-mark.circle.blue=1} of automation of these steps defines the maturity of the ML process, which [reflects the velocity of training new models given new data or training new models given new implementations]{v-mark="{type:'highlight', color:'yellow', multiline:true, at:2}"}. The following sections describe three levels of MLOps, starting from the most common level, which involves no automation, up to automating both ML and CI/CD pipelines.”


---
hideInToc: true
---

(MLOps in theory vs. [practice]{.gradient-text})

[> “In theory, theory and practice are the same. In practice, they are not.” - Einstein]{v-click}

<iframe src="https://arxiv.org/ftp/arxiv/papers/2205/2205.02302.pdf" w-full h-95 rounded shadow-lg v-click/>

---
hideInToc: true
---

Pop Quiz 💥

For each of these challenges, which ones are related to [ML]{v-mark.highlight.red=1} or [Ops]{v-mark.highlight.cyan=1}?


::left::

  • [Models and experiments are not properly tracked]{v-mark.highlight.cyan=2}
  • [Model decay]{v-mark.highlight.red=3}
  • [Changing business objectives]{v-mark.highlight.red=4}
  • [Model monitoring and (re)training]{v-mark.highlight.cyan=5}
  • [Data quality]{v-mark.highlight.red=6}
  • [Consistent project structure]{v-mark.highlight.cyan=7}
  • [Data availability]{v-mark.highlight.red=8}

::right::

  • [Code and dependencies tracking]{v-mark.highlight.cyan=9}
  • [Auditability and regulations - reproducibility and explainability]{v-mark.highlight.cyan=10}
  • [Wrong initial assumptions (problem definition)]{v-mark.highlight.red=11}
  • [Locality of the data (distributional shift)]{v-mark.highlight.red=12}
  • [Recreate model artifacts]{v-mark.highlight.cyan=13}
  • [Deploy model systems (not just one-off solutions)]{v-mark.highlight.cyan=14}

---
layout: cover
---

MLOps Illustrated

ML Lifecycle Recap


---
hideInToc: true
---

ML lifecycle & development (simplified)

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
    idea["`💡 Idea`"]
    poc["`Proof-of-Concept 🤖`"]
    mvp["`Minimal Viable Product 🦴`"]
    prod["`Iterate 🚀`"]
    idea --> poc --> mvp --> prod
```

```mermaid
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart LR
    eda["`Exploratory Data Analysis 🔎`"]
    model["`Modeling 📦`"]
    eval["`Evaluation ⚖️`"]
    deploy["`Deployment 🏗️`"]
    monitor["`Monitoring 👀`"]
    eval .-> model
    eval .-> eda
    eda --> model --> eval --> deploy --> monitor
```

{.absolute .top-0 .scale-110}


---
hideInToc: true
---

MLOps [Illustrated]{.gradient-text}

::left::

  • Data versioning 🚀
    • Reproducing models and scores
  • Feature engineering 📦
    • Version code + artifacts
  • Model training 🌱
    • Track experiments (models, hyperparameters, etc.)
    • Use seeds
  • Quality assurance 🔍
    • Unit/integration tests
    • Statistical tests
    • Stability tests
    • GenAI tests? - Validation, self reflection, etc.
  • Prepare for deployment 🏗️
    • Packaging and containerizing!
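The "use seeds" point above, minimally (in real projects you would also seed `numpy`, `torch`, etc.; `train_run` is an illustrative stand-in):

```python
import random

def train_run(seed: int) -> list[float]:
    """Stand-in for a training run whose result depends on random state."""
    random.seed(seed)
    return [random.random() for _ in range(3)]

# Same seed => identical "experiment", so models and scores can be reproduced.
assert train_run(42) == train_run(42)
```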

::right::

{.px-3}

{.rounded-lg}

<style> li { margin-top: 0 !important; } img { @apply h-20 m-2 inline !important; } </style>

---
hideInToc: true
---

Recap




  • How to deploy models depends on whether you have a [batch or real time]{v-mark.box.blue=1} use case
  • It's important to minimize [latency]{v-mark="{at:2, color:'red', brackets:['bottom'], type: 'bracket'}"} in real time use cases
  • Latency can be reduced by [scaling resources or reducing computational needs]{v-mark.highlight.yellow=3}
  • MLOps is a [set of principles]{v-mark.red=4} that reduces the burden of deploying and maintaining models
  • Unless you're doing research, the [value]{v-mark.box.yellow=5} of models only comes after they have been [deployed]{v-mark.circle.green=5}

::bottom::

[As ML gets easier and easier to do, MLOps and software engineering skills become increasingly important]{.gradient-text}


---
hideInToc: true
---

Questions?