yuyq18 / StepTool
☆21 Updated last month
Alternatives and similar repositories for StepTool:
Users who are interested in StepTool are comparing it to the repositories listed below.
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆110 Updated 6 months ago
- The awesome agents in the era of large language models ☆60 Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 Updated 5 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆52 Updated 4 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆73 Updated 2 months ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench. ☆135 Updated 3 weeks ago
- ☆43 Updated 5 months ago
- ☆18 Updated last year
- ☆28 Updated this week
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆107 Updated 6 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆171 Updated this week
- ☆16 Updated 4 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆28 Updated this week
- ☆80 Updated last year
- ☆54 Updated 5 months ago
- ☆39 Updated 4 months ago
- Non-Autoregressive Math Word Problem Solver with Unified Tree Structure ☆11 Updated last year
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆167 Updated 2 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ☆54 Updated last year
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆92 Updated last month
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆119 Updated 9 months ago
- Fast Memorization of Prompt Improves Context Awareness of Large Language Models (Findings of EMNLP 2024) ☆20 Updated 5 months ago
- ☆73 Updated 10 months ago
- ☆83 Updated 5 months ago
- Collection of papers for scalable automated alignment. ☆87 Updated 5 months ago
- ☆41 Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆67 Updated 11 months ago
- ☆42 Updated 3 weeks ago
- Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and… ☆38 Updated 5 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆65 Updated 3 months ago