huggingface / evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
⭐ 2,241 · Updated this week
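For orientation, here is a minimal sketch of typical usage, assuming the library is installed (`pip install evaluate`) and the `accuracy` metric script can be fetched from the Hugging Face Hub:

```python
# Minimal sketch: load a metric from the Hub and score toy predictions.
import evaluate

accuracy = evaluate.load("accuracy")  # downloads the metric script on first use
results = accuracy.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])
print(results)  # e.g. {'accuracy': 0.75}
```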
Alternatives and similar repositories for evaluate
Users interested in evaluate are comparing it to the libraries listed below:
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ⭐ 2,721 · Updated 3 weeks ago
- Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models ⭐ 3,069 · Updated 11 months ago
- ⭐ 1,527 · Updated last week
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ⭐ 2,950 · Updated this week
- ⭐ 2,834 · Updated 3 weeks ago
- Accessible large language models via k-bit quantization for PyTorch. ⭐ 7,150 · Updated this week
- Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models… ⭐ 2,301 · Updated this week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ⭐ 8,860 · Updated this week
- General technology for enabling AI capabilities w/ LLMs and MLLMs ⭐ 4,032 · Updated last week
- A modular RL library to fine-tune language models to human preferences ⭐ 2,317 · Updated last year
- Toolkit for creating, sharing and using natural language prompts. ⭐ 2,887 · Updated last year
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ⭐ 4,674 · Updated last year
- An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast. ⭐ 1,773 · Updated 5 months ago
- MTEB: Massive Text Embedding Benchmark ⭐ 2,626 · Updated this week
- A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca) ⭐ 1,120 · Updated last year
- The hub for EleutherAI's work on interpretability and learning dynamics ⭐ 2,547 · Updated 2 weeks ago
- Measuring Massive Multitask Language Understanding | ICLR 2021 ⭐ 1,434 · Updated 2 years ago
- A framework for few-shot evaluation of language models. ⭐ 9,379 · Updated this week
- ⭐ 1,224 · Updated 10 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ⭐ 2,426 · Updated this week
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ⭐ 1,847 · Updated 3 weeks ago
- Expanding natural instructions ⭐ 1,006 · Updated last year
- Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project. ⭐ 561 · Updated last year
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ⭐ 1,641 · Updated this week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ⭐ 1,397 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ⭐ 2,020 · Updated 3 months ago
- Aligning pretrained language models with instruction data generated by themselves. ⭐ 4,396 · Updated 2 years ago
- Efficient few-shot learning with Sentence Transformers ⭐ 2,509 · Updated 2 months ago
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ⭐ 3,101 · Updated 2 weeks ago
- Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers". ⭐ 2,131 · Updated last year