GAIR-NLP / AgencyBench
AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
☆50 · Jan 23, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for AgencyBench
Users who are interested in AgencyBench are comparing it to the repositories listed below
- ☆45 · Jan 31, 2026 · Updated 2 weeks ago
- Official Repo for Paper: "Reward Auditor: Inference on Reward Modeling Suitability in Real-World Perturbed Scenarios" ☆31 · Jan 24, 2026 · Updated 3 weeks ago
- Benchmark dataset for the paper "Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with … ☆23 · May 20, 2025 · Updated 8 months ago
- Reusable components for AI coding agents: skills, subagents, MCP servers, and extensions. ☆26 · Feb 6, 2026 · Updated last week
- Templates and examples for ACL and EMNLP conference posters. ☆14 · Oct 5, 2024 · Updated last year
- Transcripts of Democratic Debates as R Package ☆10 · Jun 17, 2020 · Updated 5 years ago
- ☆21 · Dec 15, 2025 · Updated last month
- A scalable benchmark for state representation learning in visual reinforcement learning. ☆16 · Jun 23, 2025 · Updated 7 months ago
- Download UKB bulk data ☆12 · Jul 27, 2020 · Updated 5 years ago
- Visualization of WhatsApp chat history data ☆10 · Jan 31, 2016 · Updated 10 years ago
- ☆12 · Sep 23, 2024 · Updated last year
- ☆23 · Oct 31, 2025 · Updated 3 months ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter". ☆12 · Dec 27, 2023 · Updated 2 years ago
- Vertebral-level CT/X-ray registration through joint 3D Radiative Gaussians (RadGS) reconstruction and 3D/3D registration. ☆25 · Oct 18, 2025 · Updated 3 months ago
- [ACL 2023] To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion ☆13 · Feb 3, 2023 · Updated 3 years ago
- Short course using RStudio for biological data analysis ☆14 · Jul 7, 2022 · Updated 3 years ago
- ☆13 · Jul 14, 2024 · Updated last year
- ☆16 · Jan 5, 2025 · Updated last year
- Table logger using Rich ☆13 · Aug 13, 2025 · Updated 6 months ago
- Official resources of "The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reaso… ☆16 · Jun 13, 2025 · Updated 8 months ago
- ☆14 · Feb 26, 2025 · Updated 11 months ago
- [CVPR' 25] Official repo for From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Cal… ☆21 · Jun 6, 2025 · Updated 8 months ago
- An explainable AI system that combines Graph Intelligence, Vector Search, and Retrieval-Augmented Generation (RAG) to deliver grounded an… ☆24 · Nov 1, 2025 · Updated 3 months ago
- LangChain + llamaCPP + babyAGI implementation ☆13 · Apr 12, 2023 · Updated 2 years ago
- Exercises and resources for the AI Coding Summit Context Engineering remote workshop! ☆24 · Oct 16, 2025 · Updated 3 months ago
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆24 · Mar 4, 2025 · Updated 11 months ago
- A MATLAB package for multi-modal voxel-wise brain image analysis ☆15 · May 16, 2023 · Updated 2 years ago
- NeuroElf (MATLAB) ☆18 · Jan 19, 2025 · Updated last year
- An all in one Launchy plugin. ☆16 · Mar 17, 2021 · Updated 4 years ago
- Mutable dynamic data structures for R ☆18 · Jul 16, 2025 · Updated 6 months ago
- ☆40 · Nov 8, 2025 · Updated 3 months ago
- Code and data repository for "The Mirage of Model Editing: Revisiting Evaluation in the Wild" ☆16 · Aug 27, 2025 · Updated 5 months ago
- Building an Intelligent AWS Cloud Engineer Agent with Strands Agents SDK ☆23 · Dec 16, 2025 · Updated last month
- Implementation of Recursive Language Model paper from scratch ☆31 · Feb 4, 2026 · Updated last week
- One tiny lib for LLM token + cost math ☆30 · Jan 16, 2026 · Updated 3 weeks ago
- SvelteKit (svelte v5) + Tauri V2 + FastAPI Template ☆21 · Jul 23, 2025 · Updated 6 months ago
- Multiscale Atlas of Gene expression for Integrative Cortical Cartography ☆15 · Feb 8, 2024 · Updated 2 years ago
- ✨ CocoIndex Claude Code Skill ✨ ☆40 · Jan 31, 2026 · Updated 2 weeks ago
- Visualization in Python: an overview ☆17 · Jul 31, 2018 · Updated 7 years ago