kathrinse / be_great
A novel approach for synthesizing tabular data using pretrained large language models
☆309 Updated 5 months ago
Alternatives and similar repositories for be_great:
Users who are interested in be_great are comparing it to the libraries listed below.
- A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation. ☆225 Updated last month
- [ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models" ☆439 Updated 9 months ago
- Official GitHub for CTAB-GAN+ ☆74 Updated 11 months ago
- A framework for prototyping and benchmarking imputation methods ☆179 Updated 2 years ago
- Benchmarking synthetic data generation methods. ☆272 Updated this week
- Official git for "CTAB-GAN: Effective Table Data Synthesizing" ☆85 Updated last year
- Metrics to evaluate quality and efficacy of synthetic datasets. ☆229 Updated last week
- ☆70 Updated last month
- Experiments on Tabular Data Models ☆276 Updated last year
- The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning" ☆290 Updated 5 months ago
- Official git for "TabuLa: Harnessing Language Models for Tabular Data Synthesis" ☆39 Updated last month
- ☆148 Updated last year
- Official Implementations of "Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space" ☆139 Updated 9 months ago
- A Natural Language Interface to Explainable Boosting Machines ☆66 Updated 9 months ago
- WeightedSHAP: analyzing and improving Shapley based feature attributions (NeurIPS 2022) ☆160 Updated 2 years ago
- ☆475 Updated 8 months ago
- [ICLR 2025] TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation ☆64 Updated last week
- A collection of research materials on SSL for non-sequential tabular data (SSL4NSTD) ☆185 Updated 2 months ago
- Generating and Imputing Tabular Data via Diffusion and Flow XGBoost Models ☆148 Updated 8 months ago
- TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks ☆68 Updated 2 weeks ago
- Compare and ensemble models without retraining ☆53 Updated this week
- ☆64 Updated last year
- pyDVL is a library of stable implementations of algorithms for data valuation and influence function computation ☆124 Updated last week
- Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automat… ☆154 Updated 3 months ago
- Evaluate real and synthetic datasets against each other ☆86 Updated 3 months ago
- A repo for transfer learning with deep tabular models ☆102 Updated 2 years ago
- Tabular Deep Learning Library for PyTorch ☆640 Updated this week
- For calculating global feature importance using Shapley values. ☆267 Updated this week
- A library of Reversible Data Transforms ☆124 Updated last week
- ☆26 Updated 2 years ago