FanbinLu / STEVE-R1
R1-like Computer-use Agent
☆77 Updated 3 months ago
Alternatives and similar repositories for STEVE-R1
Users that are interested in STEVE-R1 are comparing it to the libraries listed below
- Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS ☆1,201 Updated 3 months ago
- [NeurIPS2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ☆136 Updated 3 months ago
- Codebase for Iterative DPO Using Rule-based Rewards ☆252 Updated 3 months ago
- [NeurIPS 2024] Matryoshka Query Transformer for Large Vision-Language Models ☆110 Updated last year
- ☆62 Updated 4 months ago
- [ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator" ☆249 Updated last week
- [ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models ☆49 Updated 5 months ago
- Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models ☆176 Updated 8 months ago
- ☆195 Updated this week
- ✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework ☆238 Updated last month
- [ICML2025] Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment ☆109 Updated 3 weeks ago
- Official Repository of OmniCaptioner ☆152 Updated 2 months ago
- [ACL'25] Code for "Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering" ☆20 Updated last month
- Official code of paper "Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models" ☆79 Updated last month