IBM / unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data for end-to-end AI benchmarking
☆211 · Updated last week
Alternatives and similar repositories for unitxt
Users who are interested in unitxt are comparing it to the libraries listed below.
- Let's build better datasets, together! ☆264 · Updated 10 months ago
- codebase release for EMNLP2023 paper publication ☆19 · Updated 2 months ago
- ☆43 · Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆122 · Updated 2 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆276 · Updated last year
- ☆138 · Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated last year
- ☆119 · Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆290 · Updated 8 months ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning ☆188 · Updated 4 months ago
- awesome synthetic (text) datasets ☆305 · Updated this week
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆68 · Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆109 · Updated 11 months ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators" ☆134 · Updated 2 years ago
- Collection of tuning recipes with HuggingFace SFTTrainer and PyTorch FSDP. ☆52 · Updated last week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆50 · Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting". ☆110 · Updated 5 months ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆136 · Updated 10 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆78 · Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆104 · Updated 3 weeks ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆69 · Updated last year
- ☆256 · Updated 7 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆268 · Updated 6 months ago
- experiments with inference on llama ☆103 · Updated last year
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆297 · Updated last year
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing ☆52 · Updated last year
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆79 · Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆242 · Updated last year
- ☆58 · Updated last year
- Evaluating LLMs with CommonGen-Lite ☆91 · Updated last year