huggingface / data-is-better-together
Let's build better datasets, together!
☆205 Updated this week
Related projects
Alternatives and complementary repositories for data-is-better-together
- awesome synthetic (text) datasets ☆242 Updated 3 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆236 Updated 4 months ago
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆246 Updated 2 weeks ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆106 Updated 3 weeks ago
- ☆93 Updated last month
- ☆129 Updated 3 weeks ago
- ☆105 Updated 2 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆61 Updated 9 months ago
- ☆108 Updated this week
- ☆131 Updated 4 months ago
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆96 Updated 7 months ago
- ☆66 Updated this week
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated last month
- Late Interaction Models Training & Retrieval ☆165 Updated this week
- An Open Source Toolkit For LLM Distillation ☆356 Updated 2 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆183 Updated last month
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆93 Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs ☆165 Updated 2 weeks ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆97 Updated 7 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆180 Updated 3 weeks ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆98 Updated 10 months ago
- Automatically evaluate your LLMs in Google Colab ☆559 Updated 6 months ago
- Easily embed, cluster and semantically label text datasets ☆462 Updated 7 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc. on task… ☆133 Updated last month
- Efficiently find the best-suited language model (LM) for your NLP task ☆91 Updated this week
- Notebooks for training universal 0-shot classifiers on many different tasks ☆106 Updated 7 months ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆161 Updated 10 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 Updated 8 months ago
- ☆451 Updated 3 weeks ago