☆23Jan 8, 2024Updated 2 years ago
Alternatives and similar repositories for CVLM
Users that are interested in CVLM are comparing it to the libraries listed below
- ☆19Dec 6, 2023Updated 2 years ago
- ☆88Jul 4, 2024Updated last year
- Lion: Kindling Vision Intelligence within Large Language Models☆51Jan 25, 2024Updated 2 years ago
- Large Multimodal Model☆15Apr 8, 2024Updated last year
- ☆18May 14, 2024Updated last year
- ☆21Feb 29, 2024Updated 2 years ago
- ☆134Dec 22, 2023Updated 2 years ago
- Code for "DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets", accepted at NeurIPS 2023 (Main confer…☆27Mar 29, 2024Updated last year
- ☆29May 13, 2024Updated last year
- A collection of visual instruction tuning datasets.☆76Mar 14, 2024Updated last year
- ☆37Jul 9, 2024Updated last year
- Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models☆37Sep 19, 2023Updated 2 years ago
- [ICLR2025] LLaVA-HR: High-Resolution Large Language-Vision Assistant☆246Aug 14, 2024Updated last year
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- Twinkle✨: Training workbench to make your model glow.☆45Updated this week
- ☆115Updated this week
- Azure Machine Learning - MLOps Python SDKv2☆10Jul 24, 2023Updated 2 years ago
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated last month
- Modern normalizing flows in Python. Simple to use and easily extensible.☆12Feb 11, 2026Updated 3 weeks ago
- [EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning☆75Nov 4, 2025Updated 4 months ago
- Official repository of MMDU dataset☆104Sep 29, 2024Updated last year
- Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"☆269Jun 12, 2024Updated last year
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆25May 31, 2025Updated 9 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 4 months ago
- ☆28Jan 5, 2026Updated last month
- Image dataset augmentation for machine learning☆14Jun 8, 2023Updated 2 years ago
- Various test models in WNNX format. View them with `pip install wnetron && wnetron`☆12Jun 22, 2022Updated 3 years ago
- ☆12Nov 22, 2022Updated 3 years ago
- ☆11Dec 19, 2023Updated 2 years ago
- Mouse PET-CT image segmentation☆12Feb 19, 2023Updated 3 years ago
- Augment line images for improving OCR datasets☆10Oct 4, 2023Updated 2 years ago
- ✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM☆11Jun 16, 2025Updated 8 months ago
- A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.☆24Jan 4, 2026Updated 2 months ago
- Image Tokenizer Needs Post-Training☆24Oct 4, 2025Updated 5 months ago
- A scalable data preprocessing framework built on PySpark for LLM training☆22Dec 9, 2025Updated 2 months ago
- Surrogate Modeling of the Aerodynamic Performance for Transonic Regime☆13Feb 12, 2024Updated 2 years ago
- Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving. ACM Multimedia 2019.☆12Sep 19, 2019Updated 6 years ago
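- Chinese guide to Sora 🔥: Sora prompt-tuning guide, instruction guide, application development guide, curated resource list, and curated tools and frameworks for Sora developers 🚀☆17Updated this week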
- Python solutions to coding questions in Leetcode☆13Sep 12, 2020Updated 5 years ago