kathrinse / be_great
A novel approach for synthesizing tabular data using pretrained large language models
☆311Updated 6 months ago
Alternatives and similar repositories for be_great:
Users that are interested in be_great are comparing it to the libraries listed below
- A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.☆226Updated last month
- Official git for "CTAB-GAN: Effective Table Data Synthesizing"☆85Updated last year
- Benchmarking synthetic data generation methods.☆273Updated this week
- Official GitHub for CTAB-GAN+☆75Updated 11 months ago
- Experiments on Tabular Data Models☆277Updated last year
- The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning"☆291Updated 5 months ago
- A framework for prototyping and benchmarking imputation methods☆183Updated 2 years ago
- [ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"☆446Updated 9 months ago
- Official git for "TabuLa: Harnessing Language Models for Tabular Data Synthesis"☆39Updated last week
- Metrics to evaluate quality and efficacy of synthetic datasets.☆231Updated 3 weeks ago
- ☆70Updated 2 months ago
- ☆154Updated last year
- A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.☆545Updated 3 months ago
- WeightedSHAP: analyzing and improving Shapley based feature attributions (NeurIPS 2022)☆160Updated 2 years ago
- ☆302Updated last year
- ☆478Updated 8 months ago
- A collection of research materials on SSL for non-sequential tabular data (SSL4NSTD)☆189Updated 2 months ago
- ☆64Updated last year
- Generating and Imputing Tabular Data via Diffusion and Flow XGBoost Models☆152Updated 9 months ago
- ☆27Updated 2 years ago
- nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets☆66Updated 2 years ago
- Revisiting Pretraining Objectives for Tabular Deep Learning☆63Updated 2 years ago
- Compare and ensemble models without retraining☆55Updated this week
- A repo for transfer learning with deep tabular models☆102Updated 2 years ago
- For calculating global feature importance using Shapley values.☆268Updated this week
- Evaluate real and synthetic datasets against each other☆88Updated 4 months ago
- Train Gradient Boosting models that are both high-performance *and* Fair!☆104Updated 10 months ago
- GANs are well known for their success in realistic image generation. However, they can also be applied to tabular data generation. We will review …☆546Updated last month
- Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automat…☆157Updated 4 months ago
- A package for statistically rigorous scientific discovery using machine learning. Implements prediction-powered inference.☆236Updated 4 months ago