A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Seamlessly integrate state-of-the-art transformer models into robotics stacks
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Fast format for datasets
A paper list on multimodal and large language models, used only to record papers I read on the daily arXiv for personal reference.
Build real-time multimodal AI applications 🤖🎙️📹
ms-swift: Use PEFT or full-parameter training to fine-tune 250+ LLMs or 40+ MLLMs. (Qwen2, GLM4, Internlm2, Yi, Llama3, Llava, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
Data Infrastructure for Multimodal AI: Data, models, and orchestration in a unified declarative interface.
Automatically updated paper list
The enterprise-grade, production-ready multi-agent orchestration framework. Join our community: https://discord.com/servers/agora-999382051935506503
Multimodal Graph Learning: how to encode multiple multimodal neighbors with their relations into LLMs
Multimodal prototyping for cancer survival prediction - ICML 2024
Corpus of resources for multimodal machine learning with physiological signals
Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
'Talk to your slide deck' (Multimodal RAG) using foundation models (FMs) hosted on Amazon Bedrock and Amazon SageMaker
Docker image for LLaVA: Large Language and Vision Assistant
Turn your screen into actions (using LLMs). Inspired by adept.ai, rewind.ai, and Apple Shortcuts. Rust + WASM.
Repository containing LinkedIn posts on Generative AI: knowledge sharing, learning resources, and research explanations.