matthieuvion/README.md

Former data scientist @Linkfluence (Radarly). NLP, tabular data, time-series, APIs, but curious about anything, really.


Current projects: NLP, LLM fine-tuning / deployment optimization:
  • lmd_classi: 3-class classifier for pro-Russian comments. From (small) annotated data + baseline w/ few-shot model (SetFit) to synthetic data generation, LLM (Mistral-7B / Llama3-8B) fine-tuning and multi-e5-base classifier w/ quantization for deployment (FastAPI/Docker/GCP). Full guide, resources, and benchmarks available as organized notebooks & code (Kaggle + Hugging Face + WandB).
  • lmd_viz: Ukraine War (1st year): comments as a proxy for people's engagement. Viz + crafted a 200k comments dataset from Le Monde w/ my own API (usable "as is").

Cool stuff:

  • wzkd app & match2kd: reverse engineering matchmaking score from players' features only, using XGB & a Streamlit app.
  • wzlight: light wrapper around the COD WZ API (discontinued since). Also available on PyPI.

My favorite AI/ML newsletter (you should read it too!):

Data Machina: not mine, not sponsored. Technical but simple enough, and it existed way before the LLM hype.

Pinned

  1. lmd_viz Public

    https://matthieuvion.github.io/lmd_viz/ 236k comments from Le Monde on Ukraine. A proxy to measure people's engagement. Semantic search & SBERT models testing via Sentence-Transformers / Faiss

    HTML 4

  2. wzkd Public

    Streamlit-based dashboard that collects, aggregates, and visualizes players' stats from Call of Duty Warzone 1. Also deploys our model (cf. /match2kd repo) to predict game difficulty ("lobby kd") from game f…

    Python 4

  3. lmd-fastapi-docker Public

    Containerize (Docker) and serve (FastAPI) the lmd-comment classifier (ONNX).

    Python

  4. mt-leaderboard Public

    Just to crunch some stats. Reverse engineering the unofficial/undocumented API for Monster Train (game) daily challenge scoreboards.

    C# 1