statice / awesome-synthetic-data
A curated list of awesome synthetic data tools (open source and commercial).
☆93 · Updated 8 months ago
Related projects:
- A curated list of resources dedicated to synthetic data ☆115 · Updated 2 years ago
- A novel approach for synthesizing tabular data using pretrained large language models ☆271 · Updated 3 months ago
- A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation. ☆203 · Updated last month
- Fiddler Auditor is a tool to evaluate language models. ☆163 · Updated 6 months ago
- LOTUS: The semantic query engine - process data with LMs as easily as writing pandas code ☆164 · Updated 3 weeks ago
- A Natural Language Interface to Explainable Boosting Machines ☆59 · Updated 2 months ago
- Automated knowledge graph creation SDK ☆95 · Updated 2 months ago
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆201 · Updated this week
- A Unified Framework for Quantifying Privacy Risk in Synthetic Data according to the GDPR ☆66 · Updated 2 months ago
- AIDE: the Machine Learning CodeGen Agent ☆306 · Updated 3 months ago
- A curated list of papers & technical articles on AI Quality & Safety ☆155 · Updated 11 months ago
- Data for the Chat With Your Data benchmark. ☆123 · Updated 9 months ago
- A curated list of awesome resources for creating synthetic data ☆37 · Updated 2 years ago
- Synthetic data generators for structured and unstructured text, featuring differentially private learning. ☆579 · Updated last week
- Work on using Retrieval-Augmented Generation (RAG) to combine Fast Healthcare Interoperable Resources (FHIR) with Generative AI. ☆76 · Updated 5 months ago
- TalkToModel gives anyone the powers of XAI through natural language conversations! ☆108 · Updated last year
- Sample notebooks and prompts for LLM evaluation ☆104 · Updated 5 months ago
- Generative models to automatically anonymize data to meet GDPR & CCPA standards. ☆29 · Updated last year
- Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automat… ☆121 · Updated 7 months ago
- Generative AI Deep Dive Workshops ☆66 · Updated 2 weeks ago
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆143 · Updated 5 months ago
- This is an open-source tool to assess and improve the trustworthiness of AI systems. ☆70 · Updated this week
- Including Hugging Face Deep learning Containers for Google Cloud ☆106 · Updated this week
- SDNist: Benchmark data and evaluation tools for data synthesizers. ☆31 · Updated 3 months ago
- nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets ☆62 · Updated last year
- ☆56 · Updated 5 months ago
- ☆127 · Updated 2 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆117 · Updated 3 weeks ago
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language. ☆52 · Updated last month
- Framework for LLM evaluation, guardrails and security ☆94 · Updated last week