Junjie-Ye / RoTBench
[EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning
☆15 · May 13, 2025 · Updated 9 months ago
Alternatives and similar repositories for RoTBench
Users who are interested in RoTBench are comparing it to the libraries listed below.
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆73 · May 13, 2025 · Updated 9 months ago
- A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models ☆19 · May 24, 2025 · Updated 8 months ago
- This repository provides open-source code for sparse continuous distributions and corresponding Fenchel-Young losses. ☆16 · May 10, 2023 · Updated 2 years ago
- ☆25 · Jul 15, 2023 · Updated 2 years ago
- ☆24 · Sep 2, 2024 · Updated last year
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated last year
- Enhanced Explainable Neural Network ☆10 · Dec 25, 2021 · Updated 4 years ago
- Code for the paper "SMACE: A New Method for the Interpretability of Composite Decision Systems", ECML 2022 ☆15 · Apr 17, 2023 · Updated 2 years ago
- Software package for intertemporal pricing optimization under reference effects and consumer heterogeneity estimation. Please see REAMDE.… ☆10 · Mar 7, 2024 · Updated last year
- JSSP dataset for LLMs ☆17 · May 29, 2025 · Updated 8 months ago
- ☆11 · Jan 13, 2026 · Updated last month
- Solver for discrete Mixed Observable Markov Decision Processes ☆11 · Oct 30, 2020 · Updated 5 years ago
- ☆10 · Aug 16, 2023 · Updated 2 years ago
- ☆12 · Feb 27, 2023 · Updated 2 years ago
- Deep Generative Model (Torch) ☆11 · Apr 19, 2016 · Updated 9 years ago
- Multi-resource Dynamic Coordinated Planning of Flexible Distribution Network ☆15 · Jun 11, 2024 · Updated last year
- ☆10 · Apr 15, 2022 · Updated 3 years ago
- This is for the AI enzyme design course ☆13 · Nov 10, 2025 · Updated 3 months ago
- A General Quantum Software ☆17 · Dec 11, 2025 · Updated 2 months ago
- ☆11 · Dec 14, 2022 · Updated 3 years ago
- Comparing sequential forecasters via confidence sequences & e-processes ☆11 · Oct 24, 2023 · Updated 2 years ago
- ☆10 · Oct 26, 2022 · Updated 3 years ago
- ☆13 · Feb 4, 2025 · Updated last year
- RAG as a service to deploy your AI faster ☆14 · Apr 14, 2025 · Updated 10 months ago
- Official code for Conformal Isometry of Lie Group Representation in Recurrent Network of Grid Cells (NeurIPS workshop on Symmetry and Geo… ☆13 · Nov 1, 2022 · Updated 3 years ago
- Temporal summarization framework ☆10 · Dec 4, 2023 · Updated 2 years ago
- This repository reproduces the results in the paper "How expressive are transformers in spectral domain for graphs?" (published in TMLR) ☆12 · Jul 10, 2022 · Updated 3 years ago
- Code for the paper: Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness ☆12 · Oct 22, 2023 · Updated 2 years ago
- This is the repo for CROssBARv2 Knowledge Graph data. CROssBARv2 is a heterogeneous general-purpose biomedical KG-based system. ☆11 · Feb 4, 2026 · Updated last week
- SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks ☆14 · Mar 2, 2023 · Updated 2 years ago
- Adaptive and Robust Multi-Task Learning ☆10 · May 19, 2024 · Updated last year
- ☆18 · Oct 6, 2025 · Updated 4 months ago
- On the Robustness of Graph Neural Diffusion to Topology Perturbations ☆16 · Nov 4, 2022 · Updated 3 years ago
- Repository for AAAI'22 paper "MLink: Linking Black-Box Models for Collaborative Multi-Model Inference". ☆11 · Oct 25, 2023 · Updated 2 years ago
- Our paper is titled "NUS-IDS at FinCausal 2021: Dependency Tree in Graph Neural Networks for better Cause-Effect Span Detection". ☆13 · Feb 11, 2022 · Updated 4 years ago
- Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting ☆12 · Mar 24, 2023 · Updated 2 years ago
- The official implementation of the paper **LVChat: Facilitating Long Video Comprehension** ☆14 · Apr 15, 2024 · Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State ☆11 · Sep 21, 2024 · Updated last year
- Code for the paper "Deep FTRL-ORW: An Efficient Deep Reinforcement Learning Algorithm for Solving Imperfect Information Extensive-Form Ga… ☆11 · Dec 1, 2022 · Updated 3 years ago