Evaluation harness for OpenHands V1.
☆53Mar 6, 2026Updated this week
Alternatives and similar repositories for benchmarks
Users who are interested in benchmarks are comparing it to the repositories listed below
- Never lose context again with a persistent, queryable memory system for AI agents and development teams.☆18Jan 29, 2026Updated last month
- Game engine for the web version of the Avalon board game☆12Aug 2, 2025Updated 7 months ago
- Siren: Byzantine-robust Federated Learning via Proactive Alarming (SoCC '21)☆11Mar 28, 2024Updated last year
- Elastic computing platform☆30Updated this week
- [NeurIPS 2025] GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer☆24Dec 1, 2025Updated 3 months ago
- [NeurIPS 23] Characterizing OOD Error via Optimal Transport☆13Nov 19, 2023Updated 2 years ago
- Official implementation of RoSAS: Deep Semi-supervised Anomaly Detection with Contamination-resilient Continuous Supervision☆11Jul 18, 2023Updated 2 years ago
- ATP: Directed Graph Embedding with Asymmetric Transitivity Preservation☆10Apr 18, 2019Updated 6 years ago
- ☆21Dec 25, 2025Updated 2 months ago
- KGym - A platform to run hundreds to thousands of ML4Linux kernel experiments at scale☆14Nov 8, 2025Updated 3 months ago
- Voxel-based Editor☆13Jul 11, 2018Updated 7 years ago
- Code and results accompanying our paper titled Leveraging Unlabeled Data to Predict Out-of-Distribution Performance at ICLR 2022☆10Dec 8, 2022Updated 3 years ago
- Open IPython notebooks for the book Scientific Computing with Python☆11Jul 16, 2015Updated 10 years ago
- TopoTrans: Optimal Transport meets Topological Data Analysis☆14Apr 20, 2023Updated 2 years ago
- Continual Memorization of Factoids in Large Language Models☆12Nov 20, 2024Updated last year
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks☆20Sep 18, 2025Updated 5 months ago
- AI coding models, agents, CLIs, IDEs, AI app builders, open source tooling, benchmarks☆40Feb 24, 2026Updated last week
- Public Evaluation Result Archive for BFCL☆27Dec 17, 2025Updated 2 months ago
- Contains examples and assignments for my CS 254 course at Vanderbilt University, which can be accessed via http://www.dre.vanderbilt.edu/…☆14Apr 25, 2022Updated 3 years ago
- Implementation of the Implicit Knowledge Extraction Attack.☆18May 28, 2025Updated 9 months ago
- The TacTok automated Coq proof script synthesis tool☆17Jan 9, 2024Updated 2 years ago
- Port of Super Monkey Ball 2: Sakura Edition for PSVITA.☆14Dec 31, 2025Updated 2 months ago
- [CVPR'24] LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning☆15Jan 15, 2025Updated last year
- ☆18Nov 30, 2025Updated 3 months ago
- [ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling☆34Feb 25, 2026Updated last week
- Color palette and swatches for macOS's color picker.☆20Jun 9, 2020Updated 5 years ago
- Interview experience notes☆14Sep 11, 2019Updated 6 years ago
- Official repo for FSE'24 paper "CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking"☆18Mar 10, 2025Updated 11 months ago
- ☆24Aug 19, 2025Updated 6 months ago
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment"☆17Feb 26, 2026Updated last week
- Reflection library for Coq☆12Sep 26, 2019Updated 6 years ago
- GSOC 2017 - Apache Organization - # Implementation of Factorization Machines on Spark using parallel stochastic gradient descent (python…☆14Mar 26, 2017Updated 8 years ago
- ☆15Dec 29, 2023Updated 2 years ago
- Implementation of the listwise Learning to Rank algorithm described in the paper by Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Ha…☆17Jun 20, 2018Updated 7 years ago
- Cytoscape in Vue☆16Jan 6, 2023Updated 3 years ago
- A socket-based group chat Android app implemented in MVC, MVP, MVVM and FRP☆10Jun 24, 2019Updated 6 years ago
- ☆18Aug 15, 2022Updated 3 years ago
- Utilizing code clone detection to generate comments automatically.☆17Aug 26, 2019Updated 6 years ago
- ☆28Jan 31, 2026Updated last month