This is the repository for the paper "EscapeBench: Pushing Language Models to Think Outside the Box"
☆18 Dec 19, 2024 Updated last year
Alternatives and similar repositories for EscapeBench
Users that are interested in EscapeBench are comparing it to the libraries listed below
- ☆21 Sep 7, 2025 Updated 5 months ago
- ☆16 Apr 19, 2021 Updated 4 years ago
- Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals. ☆18 Apr 25, 2021 Updated 4 years ago
- ☆24 Mar 1, 2025 Updated last year
- Winner of Cloth Competition: ICRA 2023, ICRA 2024 - Center Direction Network for Grasping Point Localization on Cloths - IEEE Robotic… ☆21 Feb 2, 2026 Updated last month
- ☆25 May 28, 2025 Updated 9 months ago
- This is the repository for paper "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models" ☆29 Oct 8, 2023 Updated 2 years ago
- ☆29 Oct 18, 2022 Updated 3 years ago
- [EMNLP 2024 Findings] Benchmarking Language Model Agents for Data-Driven Science ☆34 Oct 25, 2024 Updated last year
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆81 May 7, 2024 Updated last year
- Sotopia-RL: Reward Design for Social Intelligence ☆46 Jan 29, 2026 Updated last month
- CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models ☆32 Jul 2, 2023 Updated 2 years ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆31 Jul 9, 2024 Updated last year
- ☆72 Oct 1, 2025 Updated 5 months ago
- Code for the paper "SMACE: A New Method for the Interpretability of Composite Decision Systems", ECML 2022 ☆15 Apr 17, 2023 Updated 2 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 Apr 15, 2024 Updated last year
- Enhanced Explainable Neural Network ☆10 Dec 25, 2021 Updated 4 years ago
- ☆52 Mar 18, 2025 Updated 11 months ago
- ☆43 May 29, 2025 Updated 9 months ago
- Code Repository for ControlVLA, CoRL2025. ☆85 Oct 26, 2025 Updated 4 months ago
- A Slack bot for entering attendance records from Slack ☆14 Jun 5, 2023 Updated 2 years ago
- Deep Generative Model (Torch) ☆11 Apr 19, 2016 Updated 9 years ago
- DREEM Relates Every Entities' Motion (DREEM). Global Tracking Transformers for biological multi-object tracking. ☆13 Updated this week
- Temporal summarization framework ☆10 Dec 4, 2023 Updated 2 years ago
- ☆12 Feb 24, 2026 Updated last week
- Software package for intertemporal pricing optimization under reference effects and consumer heterogeneity estimation. Please see README.… ☆10 Mar 7, 2024 Updated last year
- Official code for Conformal Isometry of Lie Group Representation in Recurrent Network of Grid Cells (NeurIPS workshop on Symmetry and Geo… ☆13 Nov 1, 2022 Updated 3 years ago
- ☆12 Feb 27, 2023 Updated 3 years ago
- JSSP dataset for LLMs ☆17 May 29, 2025 Updated 9 months ago
- This repo contains an efficient RGB-D feature extractor for category-level and instance-level 6D pose estimation. ☆14 Oct 29, 2025 Updated 4 months ago
- ☆10 Aug 16, 2023 Updated 2 years ago
- On the Robustness of GUI Grounding Models Against Image Attacks ☆12 Apr 8, 2025 Updated 10 months ago
- Multi-resource Dynamic Coordinated Planning of Flexible Distribution Network ☆15 Jun 11, 2024 Updated last year
- ☆10 Oct 26, 2022 Updated 3 years ago
- The main controller for services in the cs-insights project through docker-compose. ☆13 Aug 25, 2023 Updated 2 years ago
- ☆11 Jan 13, 2026 Updated last month
- ☆10 Jul 13, 2024 Updated last year
- Adaptive and Robust Multi-Task Learning ☆10 May 19, 2024 Updated last year
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents ☆21 Jan 6, 2026 Updated last month