fairyshine / Seal-Tools
The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmark.
☆53Nov 5, 2024Updated last year
Alternatives and similar repositories for Seal-Tools
Users who are interested in Seal-Tools are comparing it to the repositories listed below.
- [COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models☆18Jan 18, 2025Updated last year
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…☆53Jun 24, 2024Updated last year
- Companion code to https://arxiv.org/abs/2402.15491☆21Sep 18, 2025Updated 4 months ago
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024).☆25Jul 15, 2025Updated 7 months ago
- Hammer: Robust Function-Calling for On-Device Language Models via Function Masking☆113Jun 13, 2025Updated 8 months ago
- Companion code to https://arxiv.org/abs/2409.03797v2☆19Sep 18, 2025Updated 4 months ago
- ☆18Mar 19, 2023Updated 2 years ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆65Oct 18, 2024Updated last year
- Code for Robust Fine-tuning (RbFT)☆17Jan 31, 2025Updated last year
- Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding☆20Nov 16, 2022Updated 3 years ago
- The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking"☆19Jan 25, 2025Updated last year
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step☆304Apr 3, 2024Updated last year
- 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts☆41Sep 29, 2024Updated last year
- ☆17May 17, 2022Updated 3 years ago
- ☆53Oct 10, 2024Updated last year
- ☆20Sep 2, 2024Updated last year
- EMNLP'2023 (Findings): Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!☆47Apr 12, 2024Updated last year
- 🎮 A toolkit for Relation Extraction and more...☆24May 8, 2025Updated 9 months ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆90Nov 13, 2024Updated last year
- This is the repository for paper "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models"☆29Oct 8, 2023Updated 2 years ago
- [ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use☆108Mar 21, 2024Updated last year
- ☆317Mar 26, 2024Updated last year
- Source code for GreaTer ICLR 2025 - Gradient Over Reasoning makes Smaller Language Models Strong Prompt Optimizers☆34Apr 18, 2025Updated 9 months ago
- Tracks the latest developments in AI job positions across NLP, CV, search, recommendation, and related fields.☆28Mar 17, 2023Updated 2 years ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings.☆31Dec 6, 2023Updated 2 years ago
- For new students who have just joined an NLP group☆27Nov 4, 2017Updated 8 years ago
- [ICML'24] TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks☆31Sep 20, 2024Updated last year
- URS Benchmark: Evaluating LLMs on User Reported Scenarios☆30May 30, 2025Updated 8 months ago
- Dataset Reset Policy Optimization☆31Apr 12, 2024Updated last year
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- [ACL 2024] Learning to Edit: Aligning LLMs with Knowledge Editing☆37Aug 19, 2024Updated last year
- An open-source conversational language model developed by the Knowledge Works Research Laboratory at Fudan University.☆64Oct 12, 2023Updated 2 years ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆69Aug 5, 2025Updated 6 months ago
- ☆123Jun 6, 2024Updated last year
- [ACM MM2025] The official repository for the RealSyn dataset☆40Dec 14, 2025Updated 2 months ago
- X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests☆78Feb 7, 2026Updated last week
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 6 months ago
- ☆30Apr 7, 2022Updated 3 years ago
- Complex Function Calling Benchmark.☆164Jan 20, 2025Updated last year