Quinn777 / AtomThink
☆17Updated 2 weeks ago
Alternatives and similar repositories for AtomThink:
Users who are interested in AtomThink are comparing it to the repositories listed below
- ☆34Updated 3 weeks ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆96Updated last month
- ☆73Updated last year
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆41Updated last month
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆43Updated 3 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆42Updated last month
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆62Updated last month
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆45Updated last year
- ☆22Updated 8 months ago
- ☆54Updated 5 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆41Updated 9 months ago
- ☆28Updated 5 months ago
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"☆47Updated 4 months ago
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆75Updated 5 months ago
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025)☆26Updated 3 months ago
- ☆36Updated 2 months ago
- The code and data for the paper JiuZhang3.0☆43Updated 10 months ago
- ☆28Updated 6 months ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…☆73Updated 2 months ago
- A Self-Training Framework for Vision-Language Reasoning☆71Updated 2 months ago
- The official repository of the Omni-MATH benchmark.☆77Updated 3 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆45Updated 3 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆55Updated last month
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501☆55Updated 8 months ago
- An Easy-to-use Hallucination Detection Framework for LLMs.☆58Updated 11 months ago
- Code & Dataset for Paper: "Distill Visual Chart Reasoning Ability from LLMs to MLLMs"☆51Updated 5 months ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆23Updated 6 months ago
- [NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆32Updated 9 months ago
- ☆106Updated last month
- Official Repository of Are Your LLMs Capable of Stable Reasoning?☆22Updated last week