[EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
☆15 · May 13, 2025 · Updated 9 months ago
Alternatives and similar repositories for RoTBench
Users who are interested in RoTBench are comparing it to the repositories listed below.
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆73 · May 13, 2025 · Updated 9 months ago
- This repository provides open-source code for sparse continuous distributions and corresponding Fenchel-Young losses. ☆16 · May 10, 2023 · Updated 2 years ago
- A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models ☆20 · May 24, 2025 · Updated 9 months ago
- ☆24 · Sep 2, 2024 · Updated last year
- ☆25 · Jul 15, 2023 · Updated 2 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated last year
- Enhanced Explainable Neural Network ☆10 · Dec 25, 2021 · Updated 4 years ago
- Code for the paper "SMACE: A New Method for the Interpretability of Composite Decision Systems", ECML 2022 ☆15 · Apr 17, 2023 · Updated 2 years ago
- ☆10 · Aug 16, 2023 · Updated 2 years ago
- SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks ☆14 · Mar 2, 2023 · Updated 3 years ago
- This repository reproduces the results in the paper "How expressive are transformers in spectral domain for graphs?" (published in TMLR) ☆12 · Jul 10, 2022 · Updated 3 years ago
- RAG as a service to deploy your AI faster ☆14 · Apr 14, 2025 · Updated 10 months ago
- This is for the AI enzyme design course ☆13 · Nov 10, 2025 · Updated 4 months ago
- This is the repo for CROssBARv2 Knowledge Graph data. CROssBARv2 is a heterogeneous general-purpose biomedical KG-based system. ☆11 · Feb 4, 2026 · Updated last month
- A General Quantum Software ☆18 · Feb 24, 2026 · Updated last week
- ☆10 · Apr 15, 2022 · Updated 3 years ago
- ☆12 · Feb 27, 2023 · Updated 3 years ago
- Deep Generative Model (Torch) ☆11 · Apr 19, 2016 · Updated 9 years ago
- Multi-resource Dynamic Coordinated Planning of Flexible Distribution Network ☆15 · Jun 11, 2024 · Updated last year
- ☆11 · Dec 14, 2022 · Updated 3 years ago
- Solver for discrete Mixed Observable Markov Decision Processes ☆11 · Oct 30, 2020 · Updated 5 years ago
- Software package for intertemporal pricing optimization under reference effects and consumer heterogeneity estimation. Please see README.… ☆10 · Mar 7, 2024 · Updated 2 years ago
- Comparing sequential forecasters via confidence sequences & e-processes ☆11 · Oct 24, 2023 · Updated 2 years ago
- Official code for Conformal Isometry of Lie Group Representation in Recurrent Network of Grid Cells (NeurIPS workshop on Symmetry and Geo… ☆13 · Nov 1, 2022 · Updated 3 years ago
- ☆11 · Jan 13, 2026 · Updated last month
- JSSP dataset for LLMs ☆16 · May 29, 2025 · Updated 9 months ago
- ☆13 · Feb 4, 2025 · Updated last year
- ☆10 · Oct 26, 2022 · Updated 3 years ago
- Temporal summarization framework ☆10 · Dec 4, 2023 · Updated 2 years ago
- Code for the paper: Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness ☆12 · Oct 22, 2023 · Updated 2 years ago
- The official implementation of the paper "Large Scale Knowledge Washing" ☆10 · Jun 12, 2024 · Updated last year
- Implementation of "POPCORN: Partially Observed Prediction Constrained Reinforcement Learning" (Futoma, Hughes, Doshi-Velez, AISTATS 2020) ☆11 · May 19, 2021 · Updated 4 years ago
- [NAACL 2022] TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding ☆10 · Jul 15, 2023 · Updated 2 years ago
- Code and paper repository for the ICML paper "Learning population level diffusions with generative recurrent networks" ☆11 · May 26, 2016 · Updated 9 years ago
- A novel incremental hierarchical clustering algorithm (KDD 22) ☆10 · Aug 31, 2023 · Updated 2 years ago
- The demo for "Sawtooth Factorial Topic Embeddings Guided Gamma Belief Network" in ICML 2021 ☆11 · Aug 17, 2021 · Updated 4 years ago
- Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting ☆12 · Mar 24, 2023 · Updated 2 years ago
- Official implementation of "Physics-Informed Long-Sequence Forecasting From Multi-Resolution Spatiotemporal Data". ☆11 · Dec 12, 2022 · Updated 3 years ago
- Custom graph/network/multi-weighted network class based on storing a list of neighbors for each node (as opposed to an edge list) for scalabl… ☆12 · Jan 18, 2024 · Updated 2 years ago