Find and fix bugs in natural language machine learning models using adaptive testing.
☆188May 7, 2024Updated last year
Alternatives and similar repositories for adaptive-testing
Users who are interested in adaptive-testing are comparing it to the libraries listed below.
- ☆10Aug 31, 2022Updated 3 years ago
- Ludwig benchmark☆19Mar 13, 2022Updated 4 years ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList☆2,050Jan 9, 2024Updated 2 years ago
- Generating global explanations from local ones☆11Nov 11, 2022Updated 3 years ago
- NILE: Natural Language Inference with Faithful Natural Language Explanations☆29Jun 12, 2023Updated 2 years ago
- CSS-LM: Contrastive Semi-supervised Fine-tuning of Pre-trained Language Models☆12Jul 1, 2023Updated 2 years ago
- Active Learning for Text Classification in Python☆637Mar 8, 2026Updated last week
- This repo contains the code for generating the ToxiGen dataset, published at ACL 2022.☆344Jun 17, 2024Updated last year
- Align the token outputs from spaCy and Hugging Face to help understand what language structures transformers see☆43May 16, 2022Updated 3 years ago
- ☆17Nov 30, 2022Updated 3 years ago
- ☆14Jul 5, 2024Updated last year
- ☆69May 1, 2025Updated 10 months ago
- [ICSE 2023] Differentiable interpretation and failure-inducing input generation for neural network numerical bugs.☆13Jan 5, 2024Updated 2 years ago
- [COLING 2022]: CommunityLM: Probing Partisan Worldviews from Language Models☆15Jan 31, 2023Updated 3 years ago
- Boosting Synthetic Data Generation with Effective Nonlinear Causal Discovery☆17Sep 17, 2023Updated 2 years ago
- Speechflow for emotion recognition related information decomposition☆10Jul 27, 2021Updated 4 years ago
- NL-Augmenter 🦎 → 🐍 A Collaborative Repository of Natural Language Transformations☆786May 19, 2024Updated last year
- Implementation of CaiT models in TensorFlow and ImageNet-1k checkpoints. Includes code for inference and fine-tuning.☆12Jun 9, 2023Updated 2 years ago
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".☆31Aug 18, 2024Updated last year
- ☆12Oct 17, 2023Updated 2 years ago
- Advances in Neural Information Processing Systems (NeurIPS 2021)☆23Nov 4, 2022Updated 3 years ago
- MNASNet implementation and pre-trained model in PyTorch☆10Mar 20, 2019Updated 7 years ago
- Commonsense Ability Tests☆29Mar 8, 2022Updated 4 years ago
- ☆141Oct 30, 2023Updated 2 years ago
- [Findings of EMNLP 2022] Holistic Sentence Embeddings for Better Out-of-Distribution Detection☆18Jun 14, 2023Updated 2 years ago
- Vespa application making an index of the CORD-19 dataset.☆40Jul 8, 2025Updated 8 months ago
- Self-training with Weak Supervision (NAACL 2021)☆163Jul 24, 2023Updated 2 years ago
- Röttger et al. (ACL 2021): "HateCheck: Functional Tests for Hate Speech Detection Models" - Data☆59Oct 14, 2025Updated 5 months ago
- Explainable Zero-Shot Topic Extraction☆65Aug 19, 2024Updated last year
- A library to synthesize text datasets using Large Language Models (LLMs)☆152Jan 17, 2023Updated 3 years ago
- Leveraging Transfer Learning on the classic CIFAR-10 dataset by using the weights from a pre-trained VGG-16 model.☆10Nov 15, 2018Updated 7 years ago
- An Automatic DNN Training Problem Detection and Repair System☆20Dec 23, 2023Updated 2 years ago
- (ICML 2021) Mandoline: Model Evaluation under Distribution Shift☆30Jun 14, 2021Updated 4 years ago
- This repository contains the dataset and implementation details of the paper "An In-depth Analysis of Implicit and Subtle Hate Speech Mes…☆10May 9, 2024Updated last year
- Edo Liberty's class notes from the course Algorithms in Data Mining, given at Tel Aviv University in the academic years 2011-2013☆26May 20, 2022Updated 3 years ago
- Generating Training Data Made Easy☆43Jul 3, 2020Updated 5 years ago
- Distilling Model Failures as Directions in Latent Space☆48Feb 8, 2023Updated 3 years ago
- Data and Code for Paper "Reflect Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality" (EMNLP 2022)☆11Nov 28, 2022Updated 3 years ago
- A Python package for benchmarking interpretability techniques on Transformers.☆215Sep 29, 2024Updated last year