avidml / evaluating-LLMs
Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs.
☆21, updated 2 years ago
Related projects
Alternatives and complementary repositories for evaluating-LLMs
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. (☆65, updated last year)
- A curated list of papers & technical articles on AI Quality & Safety (☆162, updated last year)
- Find and fix bugs in natural language machine learning models using adaptive testing. (☆182, updated 6 months ago)
- RATransformers - Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware! (☆41, updated last year)
- Annotated corpus + evaluation metrics for text anonymisation (☆51, updated 9 months ago)
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. (☆103, updated 6 months ago)
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness (☆97, updated last year)
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" (☆109, updated last year)
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 (☆66, updated 8 months ago)
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. (☆38, updated 7 months ago)
- A python package for benchmarking interpretability techniques on Transformers. (☆212, updated last month)
- Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy. (☆12, updated 4 months ago)
- A framework for few-shot evaluation of autoregressive language models. (☆102, updated last year)
- Augmenty is an augmentation library based on spaCy for augmenting texts. (☆151, updated 6 months ago)
- A library to synthesize text datasets using Large Language Models (LLM) (☆151, updated last year)
- ☆42, updated last year
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 (☆92, updated 3 weeks ago)
- Open source library for few shot NLP (☆77, updated last year)
- triple-encoders is a library for contextualizing distributed Sentence Transformers representations. (☆14, updated 2 months ago)
- A Python library aimed at dissecting and augmenting NER training data. (☆57, updated last year)
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. (☆64, updated last month)
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions (☆68, updated last year)
- ☆75, updated last year
- SpaCy wrapper for ConceptNet (☆88, updated last year)
- Source code and data for Like a Good Nearest Neighbor (☆28, updated 9 months ago)
- Finding semantically meaningful and accurate prompts. (☆46, updated last year)
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" (☆33, updated 2 months ago)
- codebase release for EMNLP2023 paper publication (☆19, updated 8 months ago)
- Fiddler Auditor is a tool to evaluate language models. (☆171, updated 8 months ago)