avidml / evaluating-LLMs
Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs.
☆23 Updated last week
Alternatives and similar repositories for evaluating-LLMs:
Users who are interested in evaluating-LLMs compare it to the libraries listed below.
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 Updated 2 years ago
- Tools for managing datasets for governance and training. ☆82 Updated 2 weeks ago
- 📚 A curated list of papers & technical articles on AI Quality & Safety ☆169 Updated last year
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness ☆98 Updated 3 weeks ago
- A framework for few-shot evaluation of autoregressive language models. ☆102 Updated last year
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆93 Updated last year
- Command Line Interface for Hugging Face Inference Endpoints ☆67 Updated 10 months ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- Find and fix bugs in natural language machine learning models using adaptive testing. ☆181 Updated 9 months ago
- ☆13 Updated last year
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated last month
- Annotated corpus + evaluation metrics for text anonymisation ☆54 Updated last year
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆31 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆125 Updated 11 months ago
- A Python package for benchmarking interpretability techniques on Transformers. ☆213 Updated 4 months ago
- ☆65 Updated last year
- Open source library for few-shot NLP ☆77 Updated last year
- ☆42 Updated last year
- ☆122 Updated 2 years ago
- Codebase release for EMNLP 2023 paper publication ☆19 Updated 11 months ago
- triple-encoders is a library for contextualizing distributed Sentence Transformers representations. ☆14 Updated 5 months ago
- Explainable Zero-Shot Topic Extraction ☆62 Updated 6 months ago
- Pipeline for pulling and processing online language model pretraining data from the web ☆175 Updated last year
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆67 Updated 11 months ago
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆116 Updated 2 months ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆56 Updated 8 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset. ☆93 Updated 2 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆67 Updated 4 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆104 Updated 9 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 Updated 8 months ago