AgencyBench: Benchmarking the Frontiers of Autonomous Agents in 1M-Token Real-World Contexts
☆58 · Jan 23, 2026 · Updated last month
Alternatives and similar repositories for AgencyBench
Users who are interested in AgencyBench are comparing it to the repositories listed below.
- Official Repo for Paper: "Reward Auditor: Inference on Reward Modeling Suitability in Real-World Perturbed Scenarios" ☆31 · Jan 24, 2026 · Updated last month
- LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing ☆38 · Jan 30, 2026 · Updated last month
- Benchmark dataset for the paper "Towards Next-Generation Recommender Systems: A Benchmark for Personalized Recommendation Assistant with … ☆23 · May 20, 2025 · Updated 9 months ago
- Deep Learning 2021 in School of Data Science, USTC ☆12 · May 17, 2023 · Updated 2 years ago
- The code and dataset for Boundary Representation Transformer ☆16 · Dec 8, 2025 · Updated 2 months ago
- ☆12 · Mar 22, 2025 · Updated 11 months ago
- Templates and examples for ACL and EMNLP conference posters. ☆14 · Oct 5, 2024 · Updated last year
- A scalable benchmark for state representation learning in visual reinforcement learning. ☆16 · Jun 23, 2025 · Updated 8 months ago
- official repo for `thinking with images through-self-calling` ☆21 · Dec 28, 2025 · Updated 2 months ago
- [WWW '24] UnifiedSSR: A Unified Framework of Sequential Search and Recommendation ☆12 · Feb 16, 2024 · Updated 2 years ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter". ☆12 · Dec 27, 2023 · Updated 2 years ago
- ☆12 · Sep 23, 2024 · Updated last year
- Download UKB bulk data ☆12 · Jul 27, 2020 · Updated 5 years ago
- Visualization of WhatsApp chat history data ☆10 · Jan 31, 2016 · Updated 10 years ago
- Short course using RStudio for biological data analysis ☆14 · Jul 7, 2022 · Updated 3 years ago
- Hugo theme for documenting One-Day-Only projects ☆11 · Jun 20, 2021 · Updated 4 years ago
- JAX implementation of the Mistral 7b v0.1 model ☆13 · Mar 27, 2024 · Updated last year
- Vertebral-level CT/X-ray registration through joint 3D Radiative Gaussians (RadGS) reconstruction and 3D/3D registration. ☆26 · Oct 18, 2025 · Updated 4 months ago
- Professional desktop app for converting text to audiobooks with local TTS ☆30 · Oct 6, 2025 · Updated 5 months ago
- Table logger using Rich ☆13 · Aug 13, 2025 · Updated 6 months ago
- ☆24 · Aug 26, 2025 · Updated 6 months ago
- 3 experiments for Pattern Recognition course in USTC 2020fall ☆10 · Jan 25, 2021 · Updated 5 years ago
- ☆12 · Oct 9, 2020 · Updated 5 years ago
- Vim plugin to copy text to Windows clipboard on WSL ☆12 · Jan 8, 2023 · Updated 3 years ago
- ☆13 · Jul 14, 2024 · Updated last year
- ☆13 · Oct 31, 2024 · Updated last year
- the final homework code for the class "intelligence engineering" ☆12 · Mar 1, 2020 · Updated 6 years ago
- Exercises and resources for the AI Coding Summit Context Engineering remote workshop! ☆25 · Oct 16, 2025 · Updated 4 months ago
- AI agent rules: markdown files for Claude.md, ChatGPT, Copilot, Cursor, Windsurf, and more. ☆22 · Feb 2, 2026 · Updated last month
- [ICML 2024] Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization ☆15 · May 12, 2024 · Updated last year
- NeuroElf (MATLAB) ☆18 · Jan 19, 2025 · Updated last year
- LangChain + llamaCPP + babyAGI implementation ☆13 · Apr 12, 2023 · Updated 2 years ago
- [CVPR' 25] Official repo for From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Cal… ☆21 · Jun 6, 2025 · Updated 9 months ago
- An explainable AI system that combines Graph Intelligence, Vector Search, and Retrieval-Augmented Generation (RAG) to deliver grounded an… ☆26 · Nov 1, 2025 · Updated 4 months ago
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆24 · Mar 4, 2025 · Updated last year
- An all in one Launchy plugin. ☆16 · Mar 17, 2021 · Updated 4 years ago
- Multiscale Atlas of Gene expression for Integrative Cortical Cartography ☆15 · Feb 8, 2024 · Updated 2 years ago
- Prompt Contracts ☆42 · Oct 19, 2025 · Updated 4 months ago
- All-in-one benchmarking platform for evaluating LLM. ☆15 · Nov 12, 2025 · Updated 3 months ago