huggingface / evaluate
🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
⭐2,346 · Updated last month
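As a quick illustration of the library this page indexes, here is a minimal sketch of the documented evaluate workflow; the metric name and the toy prediction/reference values are illustrative, not taken from this page:

```python
# Minimal sketch of the evaluate workflow; the metric ("accuracy")
# and the sample predictions/references are illustrative.
import evaluate

# Load a metric implementation by name from the Hugging Face Hub.
accuracy = evaluate.load("accuracy")

# Score predictions against references.
result = accuracy.compute(
    predictions=[0, 1, 1, 0],
    references=[0, 1, 0, 0],
)
print(result)  # {'accuracy': 0.75}
```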
Alternatives and similar repositories for evaluate
Users interested in evaluate are comparing it to the libraries listed below.
- A Unified Library for Parameter-Efficient and Modular Transfer Learning ⭐2,777 · Updated 2 weeks ago
- 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ⭐3,121 · Updated 2 weeks ago
- Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models ⭐3,133 · Updated last year
- Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models… ⭐2,519 · Updated this week
- The hub for EleutherAI's work on interpretability and learning dynamics ⭐2,648 · Updated 4 months ago
- Efficient few-shot learning with Sentence Transformers ⭐2,579 · Updated 2 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ⭐2,687 · Updated last week
- 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ⭐9,211 · Updated last week
- A modular RL library to fine-tune language models to human preferences ⭐2,363 · Updated last year
- Toolkit for creating, sharing and using natural language prompts. ⭐2,958 · Updated 2 years ago
- Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code. ⭐1,389 · Updated 2 years ago
- Accessible large language models via k-bit quantization for PyTorch. ⭐7,659 · Updated 3 weeks ago
- PyTorch extensions for high performance and large scale training. ⭐3,384 · Updated 6 months ago
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ⭐1,005 · Updated last year
- General technology for enabling AI capabilities w/ LLMs and MLLMs ⭐4,157 · Updated 3 months ago
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ⭐1,982 · Updated last week
- MTEB: Massive Text Embedding Benchmark (see the usage sketch after this list) ⭐2,931 · Updated this week
- Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project. ⭐565 · Updated last year
- Cramming the training of a (BERT-type) language model into limited compute. ⭐1,348 · Updated last year
- Foundation Architecture for (M)LLMs ⭐3,119 · Updated last year
- Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀 ⭐1,690 · Updated last year
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. ⭐2,070 · Updated 3 months ago
- Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the… ⭐2,061 · Updated last year
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ⭐2,021 · Updated this week
- The implementation of DeBERTa ⭐2,161 · Updated 2 years ago
- SGPT: GPT Sentence Embeddings for Semantic Search ⭐875 · Updated last year
- Data and tools for generating and inspecting OLMo pre-training data. ⭐1,332 · Updated last month
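For the MTEB entry referenced above, a hedged sketch of how the benchmark is typically driven, assuming the classic MTEB Python API and a sentence-transformers model; the task and model names are illustrative:

```python
# Hedged sketch of the classic MTEB API; the task and model names are
# illustrative, and newer MTEB releases may expose a different interface.
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Any model exposing an `encode` method works; this choice is an assumption.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Run a single benchmark task and write the scores to disk.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder="results")
```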