BenchBench is a Python package to evaluate multi-task benchmarks.
☆21 · Oct 12, 2025 · Updated 7 months ago
Alternatives and similar repositories for benchbench
Users that are interested in benchbench are comparing it to the libraries listed below.
- Achieve error-rate fairness between societal groups for any score-based classifier. ☆19 · Aug 21, 2025 · Updated 9 months ago
- Code to reproduce the paper "Do causal predictors generalize better to new domains?" ☆16 · Feb 7, 2025 · Updated last year
- ☆33 · Jan 13, 2022 · Updated 4 years ago
- Wrap around any model to output differentially private prediction sets with finite sample validity on any dataset. ☆18 · Mar 3, 2024 · Updated 2 years ago
- DualQuery: Practical Private Query Release Algorithm ☆19 · Jul 7, 2015 · Updated 10 years ago
- Survey of available speech datasets for Polish ASR development ☆17 · Jan 1, 2025 · Updated last year
- A reliable leaderboard algorithm for machine learning competitions ☆17 · May 19, 2015 · Updated 11 years ago
- Package for typesetting a book into PDF and HTML using pandoc and a bunch of other tools ☆15 · Jul 21, 2020 · Updated 5 years ago
- A Medical / Clinical Note Taking Demo Application using Deepgram Voice Agent API ☆15 · Jul 9, 2025 · Updated 11 months ago
- PyTorch-based library for various kinds of representational-similarity analysis ☆25 · Jun 7, 2024 · Updated 2 years ago
- This is the code of our work CISS Certified Robustness Against Natural Language Attacks by Causal Intervention published on ICML 2022 ☆11 · Dec 6, 2022 · Updated 3 years ago
- 🎉 TrustJudge is accepted to ICLR 2026! ☆47 · Sep 27, 2025 · Updated 8 months ago
- This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish ☆14 · May 20, 2026 · Updated 3 weeks ago
- ☆11 · Apr 13, 2023 · Updated 3 years ago
- Test-time-training on nearest neighbors for large language models ☆50 · Apr 18, 2024 · Updated 2 years ago
- Training a model without a dataset for natural language inference (NLI) ☆25 · Aug 3, 2020 · Updated 5 years ago
- ☆23 · Aug 27, 2025 · Updated 9 months ago
- Notebooks for managing NeurIPS 2014 and analysing the NeurIPS experiment. ☆13 · May 22, 2024 · Updated 2 years ago
- Normalization Matters in Weakly Supervised Object Localization (ICCV 2021) ☆11 · Oct 24, 2021 · Updated 4 years ago
- The most extensive open massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 manually selec… ☆16 · Nov 14, 2023 · Updated 2 years ago
- Many ASRs under one roof. With Benchmarking... answering the question. What is the best ASR for my dataset? ☆19 · Oct 5, 2022 · Updated 3 years ago
- Code and results accompanying our paper titled RLSbench: Domain Adaptation under Relaxed Label Shift ☆35 · Jul 19, 2023 · Updated 2 years ago
- Data and code for Natural Language Inference with Multiple Premises ☆13 · May 15, 2019 · Updated 7 years ago
- An easy way to get latex snippets into Keynote ☆22 · Dec 11, 2016 · Updated 9 years ago
- Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on real-world survey data! ☆29 · Updated this week
- ☆16 · Oct 16, 2023 · Updated 2 years ago
- AQuA: A Benchmarking Tool for Label Quality Assessment, NeurIPS'23 D&B ☆23 · Oct 17, 2023 · Updated 2 years ago
- Implements PyTorch model which updates SPD weights on Riemannian Manifold. Based on Huang, Z., & Van Gool, L. (2016). A Riemannian Netwo… ☆12 · Mar 8, 2019 · Updated 7 years ago
- Diffusion for EEG ☆11 · Jan 2, 2023 · Updated 3 years ago
- ☆14 · Sep 5, 2023 · Updated 2 years ago
- SVM's for Julia ☆40 · Jul 26, 2016 · Updated 9 years ago
- Multi-Task Speech classification of accent and gender of an english speaker on Mozilla's common voice dataset ☆28 · Updated this week
- A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app with one click. ☆16 · Dec 4, 2024 · Updated last year
- ☆13 · Jul 8, 2023 · Updated 2 years ago
- Code for the paper "Active Pointly-Supervised Instance Segmentation", ECCV 2022. ☆10 · Aug 1, 2022 · Updated 3 years ago
- A random forest classifier to predict the age-group and gender of a speaker from voice measurements. ☆18 · Apr 30, 2019 · Updated 7 years ago
- Utility for OpenAI GPT Functions ☆14 · Jun 25, 2023 · Updated 2 years ago
- Data Mining Lab 2020. The Variational Fair Autoencoder. ☆15 · Apr 18, 2023 · Updated 3 years ago
- This is the project page for paper `CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective`, in CVPR2… ☆13 · Mar 19, 2024 · Updated 2 years ago