infinigence / Infini-Megrez-Omni
☆233Updated 4 months ago
Alternatives and similar repositories for Infini-Megrez-Omni
Users who are interested in Infini-Megrez-Omni are comparing it to the libraries listed below.
- ☆310Updated 6 months ago
- GLM Series Edge Models☆142Updated last week
- Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊☆267Updated 4 months ago
- ☆161Updated 4 months ago
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…☆247Updated 7 months ago
- ☆200Updated 8 months ago
- ☆407Updated last month
- GPT-4o-level, real-time spoken dialogue system.☆334Updated 4 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆242Updated 3 months ago
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆322Updated 3 weeks ago
- GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation☆199Updated last week
- ☆173Updated 4 months ago
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.☆38Updated 6 months ago
- A real-time voice conversation system based on Tongyi Qianwen Qwen2.5-Omni, using the online API service, with support for real-time voice interaction, dynamic voice activity detection, and streaming audio processing.☆55Updated last month
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆339Updated 2 months ago
- ☆378Updated 2 weeks ago
- MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval☆187Updated last month
- 🤗 R1-AQA Model: mispeech/r1-aqa☆269Updated 2 months ago
- Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.☆328Updated last week
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆73Updated 11 months ago
- Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.☆533Updated this week
- Mixture-of-Experts (MoE) Language Model☆189Updated 9 months ago
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce…☆236Updated last week
- Deep Reasoning Translation (DRT) Project☆224Updated 3 weeks ago
- [CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness☆378Updated last month
- 🔥🔥First-ever hour scale video understanding models☆437Updated 2 weeks ago
- Train a LLaVA model with better Chinese support, and open-source the training code and data.☆61Updated 9 months ago
- Qwen DianJin: LLMs for the Financial Industry by Alibaba Cloud☆109Updated last month
- ☆896Updated 2 months ago
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities☆893Updated 2 months ago