☆35May 24, 2025Updated 10 months ago
Alternatives and similar repositories for StepTool
Users who are interested in StepTool are comparing it to the libraries listed below.
- FamilyTool benchmark☆13Sep 10, 2025Updated 7 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆54Jun 6, 2025Updated 10 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆28Aug 9, 2025Updated 8 months ago
- (ACL 2025) Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation☆12May 21, 2025Updated 10 months ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.☆225Apr 15, 2025Updated 11 months ago
- [ICLR 2025] "Noisy Test-Time Adaptation in Vision-Language Models"☆16Feb 22, 2025Updated last year
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning"☆26Mar 2, 2026Updated last month
- ☆28Jul 11, 2024Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆202Apr 17, 2025Updated 11 months ago
- A dataset for training and evaluating LLMs on decision making about "when (not) to call" functions☆58Apr 29, 2025Updated 11 months ago
- ☆38May 2, 2024Updated last year
- A PyTorch implementation of Collaborative Metric Learning (CML)☆11Oct 13, 2020Updated 5 years ago
- ☆15Feb 21, 2024Updated 2 years ago
- ☆35Jul 13, 2023Updated 2 years ago
- [NeurIPS D&B Track 2024] Source code for the paper "Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge…☆25May 2, 2025Updated 11 months ago
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆65Jan 28, 2026Updated 2 months ago
- Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings)☆16Jul 2, 2024Updated last year
- Code for the paper "Self-Detoxifying Language Models via Toxification Reversal" (EMNLP 2023)☆18Oct 17, 2023Updated 2 years ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆83Jan 14, 2025Updated last year
- ☆173Oct 29, 2025Updated 5 months ago
- [ICLR2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"☆30Feb 4, 2026Updated 2 months ago
- [NAACL 2024] Making Language Models Better Tool Learners with Execution Feedback☆43Mar 14, 2024Updated 2 years ago
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆32Dec 24, 2025Updated 3 months ago
- ☆13Jan 14, 2026Updated 3 months ago
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models"☆34Jun 29, 2024Updated last year
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆67Jun 13, 2025Updated 10 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆149Oct 27, 2024Updated last year
- [TMLR] Triple Preference Optimization☆30Feb 19, 2025Updated last year
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents☆138Feb 16, 2026Updated last month
- Modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.☆13Sep 19, 2024Updated last year
- ☆49Jul 31, 2025Updated 8 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆161Oct 30, 2024Updated last year
- ☆32May 8, 2025Updated 11 months ago
- A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.☆47Dec 17, 2025Updated 3 months ago
- Multi-turn RL framework for aligning models to be tutors instead of answerers. EMNLP 2025 Oral☆34Dec 11, 2025Updated 4 months ago
- ☆13May 13, 2025Updated 11 months ago
- Source code for our paper: "LoGU: Long-form Generation with Uncertainty Expressions".☆17May 27, 2025Updated 10 months ago
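- Seat-grabbing program for the Wuhan University Information Science Library (武大信图). Supports continuous background monitoring, grabbing window seats and seats with computers, and automatic shutdown after a seat is secured.☆15Dec 8, 2022Updated 3 years ago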
- ☆85Updated this week