NVIDIA-NeMo / DataDesigner
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch or based on seed data.
☆674, updated this week
Alternatives and similar repositories for DataDesigner
Users who are interested in DataDesigner are comparing it to the libraries listed below.
- An open-source tool for LLM prompt optimization. (☆759, updated last week)
- A CLI to estimate inference memory requirements for Hugging Face models, written in Python. (☆646, updated this week)
- Inference, Fine Tuning and many more recipes with Gemma family of models (☆279, updated 6 months ago)
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models (☆496, updated 5 months ago)
- (☆271, updated last week)
- An interface library for RL post training with environments. (☆1,112, updated this week)
- Developer Asset Hub for NVIDIA Nemotron: A one-stop resource for training recipes, usage cookbooks, datasets, and full end-to-end refere… (☆392, updated this week)
- Benchmark Large Language Models Reliably On Your Data (☆426, updated last month)
- On the Theoretical Limitations of Embedding-Based Retrieval (☆622, updated 4 months ago)
- A Lightweight Library for AI Observability (☆255, updated 11 months ago)
- RAG evaluation without the need for "golden answers" (☆338, updated last month)
- Provider-agnostic, open-source evaluation infrastructure for language models (☆714, updated last month)
- Build datasets using natural language (☆566, updated 4 months ago)
- (☆237, updated 2 months ago)
- Tool for generating high quality Synthetic datasets (☆1,484, updated 3 months ago)
- Simple UI for debugging correlations of text embeddings (☆305, updated 8 months ago)
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) (☆459, updated 5 months ago)
- Examples, end-2-end tutorials and apps built using Liquid AI Foundational Models (LFM) and the LEAP SDK (☆1,009, updated last week)
- An alignment auditing agent capable of quickly exploring alignment hypotheses (☆874, updated this week)
- (☆222, updated 6 months ago)
- Super basic implementation (gist-like) of RLMs with REPL environments. (☆574, updated 3 weeks ago)
- Together Open Deep Research (☆356, updated 9 months ago)
- Semantic Chunker is a lightweight Python package for semantically-aware chunking and clustering of text. (☆292, updated 9 months ago)
- Readymade evaluators for agent trajectories (☆467, updated 4 months ago)
- Workflows are an event-driven, async-first, step-based way to control the execution flow of AI applications like agents. (☆308, updated last week)
- Tutorial for building LLM router (☆244, updated last year)
- (☆181, updated 11 months ago)
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. (☆140, updated 5 months ago)
- Enterprise-grade memory framework for LLMs featuring GPU-optimized inference, vector storage, and automated scaling. Enables hyper-person… (☆91, updated 9 months ago)
- Generate High-Quality Synthetics, Train, Measure, and Evaluate in a Single Pipeline (☆827, updated last week)