BenchBench is a Python package to evaluate multi-task benchmarks.
☆18 · Oct 12, 2025 · Updated 5 months ago
Alternatives and similar repositories for benchbench
Users who are interested in benchbench are comparing it to the libraries listed below.
- Code to reproduce the paper "Do causal predictors generalize better to new domains?" ☆15 · Feb 7, 2025 · Updated last year
- ☆33 · Jan 13, 2022 · Updated 4 years ago
- Wrap around any model to output differentially private prediction sets with finite sample validity on any dataset. ☆18 · Mar 3, 2024 · Updated 2 years ago
- DualQuery: Practical Private Query Release Algorithm ☆19 · Jul 7, 2015 · Updated 10 years ago
- A reliable leaderboard algorithm for machine learning competitions ☆17 · May 19, 2015 · Updated 10 years ago
- Package for typesetting a book into PDF and HTML using pandoc and a bunch of other tools ☆15 · Jul 21, 2020 · Updated 5 years ago
- Code for our work CISS, "Certified Robustness Against Natural Language Attacks by Causal Intervention", published at ICML 2022 ☆11 · Dec 6, 2022 · Updated 3 years ago
- Test-time-training on nearest neighbors for large language models ☆49 · Apr 18, 2024 · Updated last year
- Notebooks for managing NeurIPS 2014 and analysing the NeurIPS experiment. ☆13 · May 22, 2024 · Updated last year
- Normalization Matters in Weakly Supervised Object Localization (ICCV 2021) ☆11 · Oct 24, 2021 · Updated 4 years ago
- Many ASRs under one roof, with benchmarking to answer the question: what is the best ASR for my dataset? ☆19 · Oct 5, 2022 · Updated 3 years ago
- Code and results accompanying our paper titled RLSbench: Domain Adaptation under Relaxed Label Shift ☆35 · Jul 19, 2023 · Updated 2 years ago
- ☆11 · Jun 3, 2024 · Updated last year
- An easy way to get latex snippets into Keynote ☆22 · Dec 11, 2016 · Updated 9 years ago
- TensorFlow implementation of FAIR's InferSent (Supervised Learning of Universal Sentence Representations from Natural Language Inference … ☆14 · Aug 6, 2018 · Updated 7 years ago
- Evaluate uncertainty, calibration, accuracy, and fairness of LLMs on real-world survey data! ☆25 · Dec 14, 2025 · Updated 3 months ago
- ☆16 · Oct 16, 2023 · Updated 2 years ago
- AQuA: A Benchmarking Tool for Label Quality Assessment, NeurIPS'23 D&B ☆23 · Oct 17, 2023 · Updated 2 years ago
- Diffusion for EEG ☆11 · Jan 2, 2023 · Updated 3 years ago
- ☆13 · Sep 5, 2023 · Updated 2 years ago
- pytest helpers for comparing images and regression testing ☆12 · Dec 31, 2024 · Updated last year
- SVMs for Julia ☆40 · Jul 26, 2016 · Updated 9 years ago
- T480s setup ☆21 · Jul 30, 2019 · Updated 6 years ago
- ☆12 · Jul 8, 2023 · Updated 2 years ago
- Code for the paper "Active Pointly-Supervised Instance Segmentation", ECCV 2022. ☆10 · Aug 1, 2022 · Updated 3 years ago
- Official code for Generative Fractional Diffusion Models ☆17 · Jan 16, 2025 · Updated last year
- Utility for OpenAI GPT Functions ☆14 · Jun 25, 2023 · Updated 2 years ago
- Data Mining Lab 2020. The Variational Fair Autoencoder. ☆15 · Apr 18, 2023 · Updated 2 years ago
- ☆18 · Oct 6, 2025 · Updated 5 months ago
- ☆42 · Jan 25, 2019 · Updated 7 years ago
- PyTorch implementation for "Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes" (ICML 2024). ☆13 · Jul 21, 2024 · Updated last year
- [EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners ☆27 · Dec 11, 2024 · Updated last year
- Code related to Shirvalkar et al. 2023 - Nature Neuroscience ☆10 · May 22, 2023 · Updated 2 years ago
- ☆10 · Jun 13, 2021 · Updated 4 years ago
- Exploring item combinations with a bar chart ☆10 · Apr 17, 2021 · Updated 4 years ago
- Create generated datasets and train robust classifiers ☆36 · Sep 1, 2023 · Updated 2 years ago
- Models for Natural Language Inference (TensorFlow version), including 'A Decomposable Attention Model for Natural Language Inference', … ☆18 · Apr 21, 2018 · Updated 7 years ago
- [ICML 2023] Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models. Ajay Jaiswal, Shiwei Liu, Ti… ☆11 · Nov 28, 2023 · Updated 2 years ago
- Extract your SlidesLive presentation. ☆15 · Apr 19, 2024 · Updated last year