datadreamer-dev / DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models.
★990, updated last month
Alternatives and similar repositories for DataDreamer:
Users who are interested in DataDreamer are comparing it to the libraries listed below.
- Evaluate your LLM's response with Prometheus and GPT4 💯 (★883, updated this week)
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… (★2,568, updated this week)
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. (★754, updated 3 weeks ago)
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. (★1,338, updated last month)
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends (★1,313, updated this week)
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy (★1,063, updated last week)
- Train Models Contrastively in Pytorch (★666, updated last month)
- Automatically evaluate your LLMs in Google Colab (★603, updated 10 months ago)
- Stanford NLP Python library for Representation Finetuning (ReFT) (★1,445, updated last month)
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. (★2,312, updated this week)
- A reading list on LLM based Synthetic Data Generation 🔥 (★1,211, updated last month)
- Automated Evaluation of RAG Systems (★562, updated 4 months ago)
- (★501, updated 4 months ago)
- Generative Representational Instruction Tuning (★610, updated last week)
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… (★922, updated 5 months ago)
- Data and tools for generating and inspecting OLMo pre-training data. (★1,162, updated last week)
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" (★463, updated last year)
- (★1,011, updated 3 months ago)
- Synthetic data curation for post-training and structured data extraction (★1,049, updated this week)
- A library for advanced large language model reasoning (★2,060, updated last month)
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs (★2,802, updated 2 weeks ago)
- TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients. (★2,271, updated this week)
- Minimalistic large language model 3D-parallelism training (★1,701, updated this week)
- FacTool: Factuality Detection in Generative AI (★857, updated 7 months ago)
- Recipes to scale inference-time compute of open models (★1,041, updated 3 weeks ago)
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI (★1,372, updated 11 months ago)
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' (★1,453, updated last month)
- YaRN: Efficient Context Window Extension of Large Language Models (★1,450, updated 11 months ago)
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. (★413, updated last week)
- Generate textbook-quality synthetic LLM pretraining data (★498, updated last year)