bytarnish / AGILE
☆126 · Updated 3 months ago
Alternatives and similar repositories for AGILE:
Users who are interested in AGILE are comparing it to the repositories listed below.
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆244 · Updated last week
- ☆125 · Updated 3 weeks ago
- WebThinker: Empowering Large Reasoning Models with Deep Research Capability ☆147 · Updated 2 weeks ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆171 · Updated last month
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso… ☆97 · Updated last month
- ☆381 · Updated this week
- ☆93 · Updated 4 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆65 · Updated 4 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training. ☆187 · Updated 2 months ago
- The related works and background techniques about OpenAI o1 ☆221 · Updated 3 months ago
- On Memorization of Large Language Models in Logical Reasoning ☆63 · Updated 3 weeks ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆133 · Updated 4 months ago
- ☆267 · Updated 8 months ago
- ☆185 · Updated 2 months ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning ☆252 · Updated last year
- ☆135 · Updated 3 weeks ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench. ☆142 · Updated last week
- ☆143 · Updated 9 months ago
- Latest Advances on Long Chain-of-Thought Reasoning ☆218 · Updated last week
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆358 · Updated 7 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆455 · Updated this week
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆169 · Updated 3 months ago
- ☆101 · Updated 4 months ago
- ☆51 · Updated 7 months ago
- ☆216 · Updated 11 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆256 · Updated 7 months ago
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step ☆267 · Updated last year
- ☆218 · Updated last year
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆236 · Updated this week
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆136 · Updated 3 weeks ago