IDEA-CCNL / Real-Gemini
A framework for real-time video understanding and interaction using a large multimodal model, supporting question answering and communication with the world through text, audio, image, and video.
☆23Updated last year
Alternatives and similar repositories for Real-Gemini:
Users that are interested in Real-Gemini are comparing it to the libraries listed below
- GPT+ tool: a simple, practical one-stop AGI architecture with built-in localization, LLM models, agents, a vector database, and smart chains☆48Updated last year
- Luann (fka TypeAgent) allows you to create many LLM-based agents (various agent types, scalable)☆21Updated last week
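- The simplest reproduction of R1 results on a small model, explaining the most important essence of O1-like models and DeepSeek R1. Think is all you need. Experiments support the claim that, for strong reasoning ability, the content of the thinking process is the core of AGI/ASI.☆45Updated 3 months ago
- This project covers experiments and applications with Yi's multimodal model series, such as Yi-VL-6B/34B.☆13Updated last year
- AGM: an AI gene map model that explores the internal mechanisms of AI models and GPT/LLM large models from the perspective of token-weight particles.☆28Updated last year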
- A simple MLLM that surpasses QwenVL-Max using only open-source data and a 14B LLM.☆37Updated 8 months ago
- Chinese version of hf-alignment-handbook: a complete set of SFT, DPO, ORPO, and CPT training tutorials for large models.☆11Updated 8 months ago
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework.☆27Updated 7 months ago
- 🎮Manipulates mobile phones just like how you would. Official code for "MobA: A Two-Level Agent System for Efficient Mobile Task Automati…☆22Updated 3 weeks ago
- Fine-tuning a multimodal large model for X-ray recognition based on qwenvl☆16Updated 6 months ago
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts☆40Updated 5 months ago
- Automatically tracks hot GitHub repos, updated daily☆49Updated this week
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆13Updated 2 weeks ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆13Updated last year
- autonomous agent with access to a tool library☆34Updated last month
- A minimal LLM sales agent framework for sales agent fast deployment and benchmark. Support OpenAI models, Claude, HuggingFace models, Gem…☆17Updated 8 months ago
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks☆16Updated 5 months ago
- A highly contextualized retrieval system integrating Large Language Models (LLMs), embeddings, and a dynamic agent-driven framework. Supp…☆20Updated 3 months ago
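- Builds high-quality LLM training datasets from video platforms such as YouTube and bilibili, web pages, and similar sources, using 01.AI large models or local small models via ollama (support for customizable output training-data formats is planned)☆18Updated last year
- An LLM RAG application supporting API calls and voice interaction.☆11Updated 10 months ago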
- Built on the robust XTuner backend framework, XTuner Chat GUI offers a user-friendly platform for quick and efficient local model inferen…☆13Updated last year
- ☆17Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆29Updated this week
- a tool for generating datasets from docs☆12Updated last month
- LinChance Fine-tuning System: a model fine-tuning Web UI built with Streamlit and LLaMA-Factory☆14Updated last year
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆23Updated 7 months ago
- the newest version of llama3, with source code explained line by line in Chinese☆22Updated last year
- aigc evals☆10Updated last year
- A novel multi-modality (Vision) RAG architecture☆27Updated 7 months ago
- Deploying Qwen2.5 with FastAPI + vLLM☆14Updated 7 months ago