MoritzLaurer / synthetic-data-blog
This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data
☆67 Updated last year
Alternatives and similar repositories for synthetic-data-blog:
Users that are interested in synthetic-data-blog are comparing it to the libraries listed below
- awesome synthetic (text) datasets ☆263 Updated 4 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆58 Updated last year
- ☆113 Updated 5 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆64 Updated 4 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 7 months ago
- Simple examples using Argilla tools to build AI ☆53 Updated 3 months ago
- ☆76 Updated 8 months ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training ☆61 Updated 3 weeks ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆119 Updated 2 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆75 Updated 5 months ago
- Set of scripts to finetune LLMs ☆36 Updated 11 months ago
- Generalist and Lightweight Model for Text Classification ☆87 Updated this week
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆102 Updated 10 months ago
- My personal site ☆72 Updated 6 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 Updated 11 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆73 Updated 4 months ago
- experiments with inference on llama ☆104 Updated 8 months ago
- ☆142 Updated 7 months ago
- ☆65 Updated 9 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 Updated 7 months ago
- ☆92 Updated 2 weeks ago
- ☆110 Updated 6 months ago
- Model, Code & Data for the EMNLP'23 paper "Making Large Language Models Better Data Creators" ☆128 Updated last year
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆94 Updated 2 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆259 Updated 2 months ago
- ☆150 Updated 3 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆80 Updated 11 months ago
- ☆74 Updated last year