Multi-Modal Language Modeling with Image, Audio and Text Integration, including multiple images and multiple audio inputs in a single multi-turn conversation.
☆18Feb 20, 2024Updated 2 years ago
Alternatives and similar repositories for multimodal-LLM
Users that are interested in multimodal-LLM are comparing it to the libraries listed below.
- ☆47Sep 15, 2025Updated 9 months ago
- ☆16Jul 17, 2025Updated 11 months ago
- ☆31May 30, 2025Updated last year
- Code for paper "Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI"☆13Jan 19, 2024Updated 2 years ago
- A minimal re-implementation of orthogonal fine-tuning (OFT), a method originally proposed for diffusion models, for LLMs. Based on nanoGPT and minLoRA.☆14Nov 17, 2023Updated 2 years ago
- ☆13Mar 21, 2023Updated 3 years ago
- Video scrubbing with WebCodecs☆15Nov 4, 2025Updated 7 months ago
- An implementation of SPEECHSPLIT☆15Sep 12, 2020Updated 5 years ago
- A Julia Package for the ACT-R Cognitive Architecture☆13Oct 30, 2025Updated 7 months ago
- ☆19May 19, 2024Updated 2 years ago
- Tutorial on how to train a custom voice recognition model using Hugging Face models.☆11Jul 2, 2023Updated 2 years ago
- Official PyTorch implementation for "MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens…☆48Jun 12, 2025Updated last year
- ☆10Jun 23, 2023Updated 2 years ago
- A replacement for the AdamW and Lion optimizers for LLMs☆13May 28, 2023Updated 3 years ago
- fast opus bindings for node and browsers☆15Feb 11, 2024Updated 2 years ago
- code for training and using chess embeddings models☆14Jun 9, 2024Updated 2 years ago
- ☆41Feb 25, 2026Updated 3 months ago
- Face recognition using Facenet☆18May 17, 2019Updated 7 years ago
- ☆20Aug 28, 2024Updated last year
- Algorithms for Policy Evaluation, Estimation of Action Values, Policy Improvement, Policy Iteration, Truncated Policy Evaluation, Truncat…☆11Apr 3, 2019Updated 7 years ago
- ☆16Nov 24, 2025Updated 6 months ago
- An extension of VirtualHome for generating and augmenting knowledge graphs☆15Oct 24, 2024Updated last year
- [Computer Speech & Language] A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages☆14Aug 9, 2024Updated last year
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.☆15Oct 5, 2023Updated 2 years ago
- Syntexmex plugin for blender☆16Mar 28, 2020Updated 6 years ago
- CHiME-9 Task 1 - MCoRec baseline☆27Jan 13, 2026Updated 5 months ago
- Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification☆17Jan 8, 2024Updated 2 years ago
- An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"☆16Aug 23, 2020Updated 5 years ago
- Flocon - A next-generation, feature-rich online session tool for tabletop RPGs (TRPGs) that you can host on your own server for free.☆16Updated this week
- An SVM model for multi-class classification of Thyroid data.☆11Dec 9, 2019Updated 6 years ago
- My configuration and setup for installing a new machine.☆11Jul 30, 2023Updated 2 years ago
- LightRAG with Neo4j Example Project☆18May 19, 2025Updated last year
- ☆12Mar 3, 2022Updated 4 years ago
- Created to house D3.js (and others as time goes by) files used at the Pittsburgh Data Visualization meetup☆15Jan 14, 2021Updated 5 years ago
- Embed Python in Unreal Engine 5.1-5.3☆24Sep 14, 2023Updated 2 years ago
- CI(詞) shows the lyrics of the song you are currently playing on Spotify.☆12Aug 7, 2016Updated 9 years ago
- This app uses OpenAI's LLM model to answer questions about your PDF file. Upload your PDF file and ask questions about it. The app will r…☆14May 13, 2025Updated last year
- ☆10Jul 22, 2015Updated 10 years ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025]☆50Jul 22, 2025Updated 10 months ago