IDEA-CCNL / Real-Gemini
A framework for real-time video understanding and interaction using a large multimodal model, for asking questions of and communicating with the world through text, audio, image, and video.
☆23 Updated last year
Alternatives and similar repositories for Real-Gemini:
Users who are interested in Real-Gemini are comparing it to the repositories listed below.
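- Luann lets you create an LLM agent with a complete memory module (long-term memory, short-term memory) and a knowledge module (Variou… ☆19 Updated 3 weeks ago
- GPT+ toolkit: a simple, practical one-stop AGI architecture with built-in localization, LLM models, agents, a vector database, and smart chains. ☆48 Updated last year
- The simplest reproduction of R1 results on small models, explaining the most important essence of O1-like models and DeepSeek R1. Think is all you need. Uses experiments to support the claim that, for strong reasoning ability, the thinking-process content is the core of AGI/ASI. ☆42 Updated 2 months ago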
- ☆16 Updated 2 months ago
- The newest version of llama3, with source code explained line by line in Chinese ☆22 Updated 11 months ago
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks ☆16 Updated 4 months ago
- AGM: an AI gene map model that explores the inner workings of AI models and GPT/LLM large models from the perspective of token-weight particles. ☆28 Updated last year
- Automatically tracks hot GitHub repos, updated daily ☆47 Updated this week
- An open-source agentic workflow framework/showcase that turns chat text into control actions, powered by the Agently AI application development framework. ☆27 Updated 6 months ago
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an… ☆13 Updated this week
- aigc evals ☆10 Updated last year
- Documentation for XAgent. ☆17 Updated last year
- LLM RAG application with support for API calls and voice interaction. ☆11 Updated 9 months ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… ☆13 Updated last year
- Measuring RAG solutions throughput and latency ☆16 Updated 8 months ago
- AgileGen: Empowering Agile-Based Generative Software Development through Human-AI Teamwork (accepted by ACM TOSEM) ☆22 Updated 5 months ago
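- Comparison of LLM API performance metrics: an in-depth analysis of key metrics such as TTFT and TPS ☆17 Updated 6 months ago
- This project covers experiments and applications with Yi's multimodal model series, such as Yi-VL-6B/34B. ☆13 Updated last year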
- ☆18 Updated last month
- ☆28 Updated last year
- A novel multi-modality (vision) RAG architecture ☆25 Updated 6 months ago
- (ICLR 2025) AgentRefine: Enhancing Agent Generalization through Refinement Tuning ☆12 Updated last month
- A demo of a PDF parser (including OCR and object detection tools) ☆34 Updated 5 months ago
- An open-source chatbot architecture for voice/vision (and multimodal) assistants that can run locally (CPU/GPU bound) or remotely (I/O bound). ☆35 Updated this week
- A simple MLLM that surpasses QwenVL-Max using only open-source data with a 14B LLM. ☆37 Updated 7 months ago
- A minimal LLM sales agent framework for fast deployment and benchmarking of sales agents. Supports OpenAI models, Claude, HuggingFace models, Gem… ☆16 Updated 7 months ago
- Community Open Source Implementation of GPT4o in PyTorch ☆29 Updated this week
- ☆36 Updated last month
- A Gradio web UI for andrewyng's translation-agent ☆29 Updated 4 months ago
- ☆63 Updated this week