huggingface / evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
★2,385 · Updated last month
Alternatives and similar repositories for evaluate
Users who are interested in evaluate are comparing it to the libraries listed below.
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ★3,225 · Updated this week
- ★1,557 · Updated last week
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ★2,791 · Updated 2 months ago
- The hub for EleutherAI's work on interpretability and learning dynamics ★2,691 · Updated last month
- ★2,920 · Updated 2 weeks ago
- ★1,256 · Updated last year
- Efficient few-shot learning with Sentence Transformers ★2,647 · Updated 2 weeks ago
- Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models … ★2,594 · Updated this week
- Toolkit for creating, sharing and using natural language prompts. ★2,984 · Updated 2 years ago
- Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models ★3,176 · Updated last year
- A modular RL library to fine-tune language models to human preferences ★2,378 · Updated last year
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★9,398 · Updated last week
- PyTorch extensions for high performance and large scale training. ★3,392 · Updated 7 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ★2,780 · Updated this week
- General technology for enabling AI capabilities w/ LLMs and MLLMs ★4,229 · Updated 2 weeks ago
- A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ★4,732 · Updated last year
- MTEB: Massive Text Embedding Benchmark ★3,036 · Updated last week
- Cramming the training of a (BERT-type) language model into limited compute. ★1,355 · Updated last year
- Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project. ★564 · Updated last year
- The implementation of DeBERTa ★2,182 · Updated 2 years ago
- Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code. ★1,400 · Updated 2 years ago
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 ★1,689 · Updated last year
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ★1,007 · Updated last year
- Measuring Massive Multitask Language Understanding | ICLR 2021 ★1,537 · Updated 2 years ago
- Original Implementation of Prompt Tuning from Lester, et al, 2021 ★700 · Updated 9 months ago
- BERT score for text generation ★1,859 · Updated last year
- Foundation Architecture for (M)LLMs ★3,128 · Updated last year
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. ★1,991 · Updated last week
- Expanding natural instructions ★1,028 · Updated 2 years ago
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ★1,808 · Updated 6 months ago