gretelai / awesome-synthetic-data
A curated list of resources dedicated to synthetic data
☆127 · Updated 2 years ago
Alternatives and similar repositories for awesome-synthetic-data:
Users who are interested in awesome-synthetic-data are comparing it to the repositories listed below.
- A curated list of awesome synthetic data tools (open source and commercial). ☆162 · Updated last year
- Use FastCUT with public map images and location data from a few cities to generate realistic synthetic location data for any city in the … ☆23 · Updated 3 years ago
- Simple interface to synthesize complex and highly dimensional datasets using Gretel APIs. ☆29 · Updated 3 weeks ago
- A curated list of awesome resources for creating synthetic data ☆42 · Updated 3 years ago
- nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets ☆65 · Updated 2 years ago
- The Gretel Python Client allows you to interact with the Gretel REST API. ☆53 · Updated this week
- Where Gretel published notebooks and code for blog posts ☆19 · Updated last year
- A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation. ☆223 · Updated 2 weeks ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆93 · Updated last year
- Generative models to automatically anonymize data to meet GDPR & CCPA standards. ☆31 · Updated 2 years ago
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆228 · Updated this week
- Public blueprints for data use cases ☆74 · Updated this week
- Fiddler Auditor is a tool to evaluate language models. ☆176 · Updated last year
- NIST Collaborative Research Cycle on Synthetic Data. Learn about Synthetic Data week by week! ☆27 · Updated last year
- A library of Reversible Data Transforms ☆124 · Updated this week
- Foundation Models for Data Tasks ☆103 · Updated last year
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆21 · Updated 2 years ago
- Synthetic data generators for structured and unstructured text, featuring differentially private learning. ☆618 · Updated last week
- ☆76 · Updated 9 months ago
- ReLM is a Regular Expression engine for Language Models ☆103 · Updated last year
- Code for paper: "Privately generating tabular data using language models". ☆15 · Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆46 · Updated last year
- Research on Tabular Foundation Models ☆43 · Updated 3 months ago
- The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System ☆109 · Updated 9 months ago
- A novel approach for synthesizing tabular data using pretrained large language models ☆303 · Updated 4 months ago
- Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automat… ☆155 · Updated 3 months ago
- Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024 Spotlight] ☆91 · Updated last month
- ☆263 · Updated 2 months ago
- pyCANON is a Python library and CLI to assess the values of the parameters associated with the most common privacy-preserving techniques. ☆35 · Updated this week
- Study the temporal performance degradation of machine learning models. ☆15 · Updated last year