uiuc-kang-lab / agentic-benchmarks
☆49Jul 31, 2025Updated 6 months ago
Alternatives and similar repositories for agentic-benchmarks
Users that are interested in agentic-benchmarks are comparing it to the libraries listed below
- SysX☆34Mar 27, 2024Updated last year
- ☆12Jan 2, 2024Updated 2 years ago
- ☆19Mar 14, 2024Updated last year
- Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks☆36Nov 27, 2025Updated 2 months ago
- Repo for Llatrieval☆31Aug 21, 2024Updated last year
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)☆28Dec 19, 2023Updated 2 years ago
- ☆17Oct 30, 2025Updated 3 months ago
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)☆51May 12, 2025Updated 9 months ago
- Leveraging Base Language Models for Few-Shot Synthetic Data Generation☆40Oct 18, 2025Updated 3 months ago
- Beyond Vibe Coding. Code, Planning, Documentation and Product Management agents.☆70Jun 16, 2025Updated 7 months ago
- (ICML 2021) Implementation for S2SD - Simultaneous Similarity-based Self-Distillation for Deep Metric Learning. Paper Link: https://arxiv…☆44Sep 18, 2020Updated 5 years ago
- ☆12Aug 15, 2023Updated 2 years ago
- Extract annotated misspellings from MIMIC-III.☆13Dec 17, 2020Updated 5 years ago
- ValTown MCP Server - Execute ValTown functions from AI assistants☆15Aug 12, 2025Updated 6 months ago
- Source Code for "Improved Embeddings for Learning Prerequisite Chains" (CPSC 490 - Senior Project)☆11May 2, 2019Updated 6 years ago
- Implementation of MetaVQA.☆12Jul 3, 2021Updated 4 years ago
- Official repository for "DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Opinion Text Generation"☆10May 20, 2022Updated 3 years ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues☆13Feb 7, 2024Updated 2 years ago
- Residual Quantization Autoencoder, used for interpreting LLMs☆14Jan 1, 2025Updated last year
- Designed to help lawyers and legal professionals find precedent fast and prepare for case negotiations by simulating trajectories☆10Oct 16, 2024Updated last year
- ☆40May 2, 2021Updated 4 years ago
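- Captures frames from a recorded video, then locates moving targets using the single-Gaussian, Gaussian-mixture, temporal-averaging, and frame-differencing methods, compares the results of the different methods, and finally outputs a video with the localization algorithm applied.☆13Nov 23, 2018Updated 7 years ago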
- ☆11Sep 8, 2017Updated 8 years ago
- Implementation of the ACL Findings paper "OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack"☆10May 24, 2021Updated 4 years ago
- Compilation of ML/AI Resources for Members of MITxHarvard Women in AI☆11Mar 28, 2022Updated 3 years ago
- RTMS sample apps☆23Feb 6, 2026Updated last week
- Code for Augment & Reduce, a scalable stochastic algorithm for large categorical distributions☆10May 16, 2018Updated 7 years ago
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models.☆13May 11, 2022Updated 3 years ago
- Official code for the paper "Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?" (ICLR 2024)☆10Aug 26, 2024Updated last year
- [ESEC/FSE'23] Hue: A User-Adaptive Parser for Hybrid Logs☆10Aug 24, 2023Updated 2 years ago
- An implementation of Compositional Attention: Disentangling Search and Retrieval by MILA☆14Jun 1, 2022Updated 3 years ago
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners"☆109Jul 15, 2023Updated 2 years ago
- Federated reconnaissance mini-ImageNet benchmark and baseline models☆13Sep 2, 2021Updated 4 years ago
- structured attention encoder☆13Jun 6, 2018Updated 7 years ago
- Official implementation for “SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain”☆20Dec 11, 2025Updated 2 months ago
- ☆10Jun 5, 2025Updated 8 months ago
- BachDuet enables a human performer to improvise a duet counterpoint with a computer agent in real time.☆14Aug 8, 2022Updated 3 years ago
- A toolkit for testing and improving named entity recognition [ESEC/FSE'23]☆11Aug 31, 2023Updated 2 years ago
- FamilyTool benchmark☆12Sep 10, 2025Updated 5 months ago