gretelai / awesome-synthetic-data
A curated list of resources dedicated to synthetic data
☆129 · Updated 2 years ago
Alternatives and similar repositories for awesome-synthetic-data
Users who are interested in awesome-synthetic-data are comparing it to the repositories listed below.
- A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation. ☆228 · Updated this week
- Public blueprints for data use cases ☆78 · Updated last week
- Use FastCUT with public map images and location data from a few cities to generate realistic synthetic location data for any city in the … ☆23 · Updated 3 years ago
- A curated list of awesome resources for creating synthetic data ☆42 · Updated 3 years ago
- A curated list of awesome synthetic data tools (open source and commercial). ☆182 · Updated last year
- Federated Learning Utilities and Tools for Experimentation ☆190 · Updated last year
- Fiddler Auditor is a tool to evaluate language models. ☆181 · Updated last year
- Simple interface to synthesize complex and highly dimensional datasets using Gretel APIs. ☆28 · Updated 3 months ago
- The Gretel Python Client allows you to interact with the Gretel REST API. ☆56 · Updated this week
- A software package for privacy-preserving generation of a synthetic twin to a given sensitive data set. ☆53 · Updated 9 months ago
- Where Gretel published notebooks and code for blog posts ☆18 · Updated last year
- SDNist: Benchmark data and evaluation tools for data synthesizers. ☆36 · Updated 2 weeks ago
- ☆22 · Updated last year
- Generative models to automatically anonymize data to meet GDPR & CCPA standards. ☆31 · Updated 2 years ago
- Differentially-private transformers using HuggingFace and Opacus ☆140 · Updated 9 months ago
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆236 · Updated last week
- Foundation Models for Data Tasks ☆106 · Updated 2 years ago
- nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets ☆68 · Updated 2 years ago
- The repository contains the code for analysing the leakage of personally identifiable (PII) information from the output of next word pred… ☆96 · Updated 9 months ago
- A Natural Language Interface to Explainable Boosting Machines ☆67 · Updated 11 months ago
- A novel approach for synthesizing tabular data using pretrained large language models ☆311 · Updated 2 weeks ago
- pyCANON is a Python library and CLI to assess the values of the parameters associated with the most common privacy-preserving techniques. ☆37 · Updated last week
- TalkToModel gives anyone the powers of XAI through natural language conversations! ☆120 · Updated last year
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆94 · Updated last year
- ReLM is a Regular Expression engine for Language Models ☆105 · Updated 2 years ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆150 · Updated this week
- Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024 Spotlight] ☆96 · Updated last week
- This repository provides a curated list of references about Machine Learning Model Governance, Ethics, and Responsible AI. ☆114 · Updated last year
- Code for paper: "Privately generating tabular data using language models". ☆15 · Updated last year
- Benchmarking synthetic data generation methods. ☆274 · Updated this week