PKU-YuanGroup / LLaVA-o1
☆56 Updated 5 months ago
Alternatives and similar repositories for LLaVA-o1:
Users who are interested in LLaVA-o1 are comparing it to the libraries listed below.
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆76 Updated last month
- ☆32 Updated 3 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 ☆58 Updated 2 months ago
- From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation ☆89 Updated last month
- ☆91 Updated 3 weeks ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets. ☆65 Updated 7 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆96 Updated 6 months ago
- OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation ☆72 Updated last month
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 Updated last month
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆81 Updated 6 months ago
- ☆61 Updated 9 months ago
- ☆75 Updated last month
- Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆52 Updated this week
- FuseAI Project ☆85 Updated 3 months ago
- Code for ScribeAgent paper ☆57 Updated 2 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆91 Updated 4 months ago
- This is the repo for the paper "PANGEA: A FULLY OPEN MULTILINGUAL MULTIMODAL LLM FOR 39 LANGUAGES" ☆105 Updated 5 months ago
- ☆92 Updated 3 months ago
- ☆73 Updated this week
- Tina: Tiny Reasoning Models via LoRA ☆164 Updated last week
- ☆26 Updated last month
- ☆24 Updated 7 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 3 months ago
- ☆150 Updated 2 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆49 Updated 3 months ago
- The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining" ☆152 Updated last month
- ☆63 Updated last month
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ☆186 Updated 3 weeks ago
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents ☆201 Updated 2 weeks ago
- ☆57 Updated last week