ADaM-BJTU / O1-CODER
AN O1 REPLICATION FOR CODING
☆325 Updated 2 months ago
Alternatives and similar repositories for O1-CODER:
Users who are interested in O1-CODER are comparing it to the repositories listed below.
- A series of technical reports on Slow Thinking with LLMs ☆409 Updated last week
- Large Reasoning Models ☆801 Updated 2 months ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆302 Updated last week
- ☆473 Updated last month
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ☆392 Updated 2 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆573 Updated last month
- ☆257 Updated 6 months ago
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆289 Updated 6 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆216 Updated this week
- ☆890 Updated 3 weeks ago
- ☆304 Updated 5 months ago
- Official repository for ICLR 2025 paper "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient an… ☆630 Updated last week
- ☆319 Updated 2 weeks ago
- A visualization tool for deeper understanding and easier debugging of RLHF training. ☆145 Updated last month
- [ACL 2024] AUTOACT: Automatic Agent Learning from Scratch for QA via Self-Planning ☆206 Updated last month
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. ☆152 Updated this week
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆106 Updated last month
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆125 Updated 2 months ago
- Building a comprehensive and handy list of papers for GUI agents ☆213 Updated last month
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ☆281 Updated 9 months ago
- Related work and background techniques for OpenAI o1 ☆210 Updated last month
- RewardBench: the first evaluation tool for reward models. ☆505 Updated this week
- ☆115 Updated 8 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆182 Updated 2 weeks ago