Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs.
☆27 · Mar 14, 2025 · Updated last year
Alternatives and similar repositories for evaluating-LLMs
Users who are interested in evaluating-LLMs are comparing it to the libraries listed below.
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper. ☆12 · Mar 6, 2023 · Updated 3 years ago
- ☆15 · Aug 3, 2020 · Updated 5 years ago
- 9th solution ☆11 · Oct 11, 2022 · Updated 3 years ago
- Official implementation of the ACL Findings 2023 paper: Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarizatio… ☆14 · Jan 25, 2024 · Updated 2 years ago
- ☆22 · Jan 5, 2024 · Updated 2 years ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆68 · Feb 8, 2023 · Updated 3 years ago
- autoredteam: code for training models that automatically red team other language models ☆15 · Aug 9, 2023 · Updated 2 years ago
- Children's Programming and Artificial Intelligence Education ☆11 · Dec 30, 2019 · Updated 6 years ago
- Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022 ☆13 · Apr 13, 2022 · Updated 3 years ago
- A library for Partially Homomorphic Encryption in Python ☆12 · May 30, 2017 · Updated 8 years ago
- VisBERT: Demo web app for "How Does BERT Answer Questions?" ☆11 · Jul 22, 2023 · Updated 2 years ago
- 3rd Place solution for Feedback Prize - Predicting Effective Arguments Kaggle competition ☆16 · Sep 6, 2022 · Updated 3 years ago
- ☆26 · Nov 21, 2022 · Updated 3 years ago
- Transformer-based Long Document Classification ☆17 · Nov 2, 2022 · Updated 3 years ago
- Evaluate Transformers from the Hub 🔥 ☆14 · Nov 27, 2023 · Updated 2 years ago
- Deep Just-In-Time Inconsistency Detection Between Comments and Source Code: Artifact ☆23 · Jul 21, 2025 · Updated 8 months ago
- ☆18 · Mar 25, 2024 · Updated last year
- Netflix for XBMC ☆61 · Nov 13, 2012 · Updated 13 years ago
- jQuery VS JS comparison table, Learn JS through jupyter notebook. ☆11 · Sep 27, 2019 · Updated 6 years ago
- A simple & community made twitter bot. It generates X for Y to help you come up with a start up idea. ☆14 · Dec 24, 2021 · Updated 4 years ago
- DeepDip, a DRL Gym agent that plays no-press Diplomacy in BANDANA ☆13 · Jul 22, 2019 · Updated 6 years ago
- graphs from Draw.io ☆14 · Sep 26, 2024 · Updated last year
- Bot that addresses typical questions about the COVID-19 virus to help you handle high volumes of questions from your customers, partners … ☆12 · Dec 5, 2022 · Updated 3 years ago
- Code and dataset for the paper: Generating Literal and Implied Subquestions to Fact-check Complex Claims ☆30 · May 30, 2023 · Updated 2 years ago
- a plugin for stackstorm ☆14 · Feb 13, 2019 · Updated 7 years ago
- A test suite (a.k.a., dataset) with ~20k moral situations for understanding LLMs' behaviors. ☆16 · May 5, 2023 · Updated 2 years ago
- A set of procedures to estimate the readability of a text ☆15 · Apr 30, 2018 · Updated 7 years ago
- Benchmarking various Deep Learning models such as BERT, ALBERT, BiLSTMs on the task of sentence entailment using two datasets - MultiNLI … ☆28 · Dec 31, 2020 · Updated 5 years ago
- ☆27 · Nov 6, 2022 · Updated 3 years ago
- Rasa X Jokebot Demo ☆16 · Apr 8, 2024 · Updated last year
- Accompanies Finastra's Hack to the Future 4 Learning Session "Sustainability reports & NLP" ☆10 · Mar 17, 2022 · Updated 4 years ago
- Code for 'Alzheimer’s Disease Classification Using Cluster-based Labelling for Graph Neural Network on Tau PET Imaging and Heterogeneous … ☆12 · Sep 13, 2022 · Updated 3 years ago
- Curated list of awesome ecommerce data science resources 📊💎💪 ☆15 · Apr 19, 2019 · Updated 6 years ago
- Data for EMNLP 2022 paper "arXivEdits: Understanding the Human Revision Process in Scientific Writing". ☆14 · Sep 30, 2023 · Updated 2 years ago
- Project repository of the paper "Less Annotating, More Classifying – Addressing the Data Scarcity Issue of Supervised Machine Learning wi… ☆35 · Mar 19, 2024 · Updated 2 years ago
- By fine tuning GPT2 on News Aggregator data ☆15 · Jan 24, 2021 · Updated 5 years ago
- Data labeling using few shot learning GPT-3. ☆25 · Mar 26, 2023 · Updated 2 years ago
- Code for co-training large language models (e.g. T0) with smaller ones (e.g. BERT) to boost few-shot performance ☆17 · Sep 23, 2022 · Updated 3 years ago
- Repository with code for MaChAmp: https://aclanthology.org/2021.eacl-demos.22/ ☆91 · Feb 12, 2026 · Updated last month