kathrinse / be_great

A novel approach for synthesizing tabular data using pretrained large language models

☆311

Alternatives and similar repositories for be_great:

Users who are interested in be_great are comparing it to the libraries listed below.

worldbank / REaLTabFormer
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
☆226 Updated last month
Team-TUD / CTAB-GAN
Official git for "CTAB-GAN: Effective Table Data Synthesizing"
☆85 Updated last year
sdv-dev / SDGym
Benchmarking synthetic data generation methods.
☆273 Updated this week
Team-TUD / CTAB-GAN-Plus
Official GitHub for CTAB-GAN+
☆75 Updated 11 months ago
kathrinse / TabSurvey
Experiments on Tabular Data Models
☆277 Updated last year
yandex-research / tabular-dl-tabr
The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning"
☆291 Updated 5 months ago
vanderschaarlab / hyperimpute
A framework for prototyping and benchmarking imputation methods
☆183 Updated 2 years ago
yandex-research / tab-ddpm
[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
☆446 Updated 9 months ago
zhao-zilong / Tabula
Official git for "TabuLa: Harnessing Language Models for Tabular Data Synthesis"
☆39 Updated last week
sdv-dev / SDMetrics
Metrics to evaluate quality and efficacy of synthetic datasets.
☆231 Updated 3 weeks ago
AI-sandbox / HyperFast
☆70 Updated 2 months ago
naszilla / tabzilla
☆154 Updated last year
vanderschaarlab / synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
☆545 Updated 3 months ago
ykwon0407 / WeightedSHAP
WeightedSHAP: analyzing and improving Shapley based feature attributions (NeurIPS 2022)
☆160 Updated 2 years ago
clinicalml / TabLLM
☆302 Updated last year
LeoGrin / tabular-benchmark
☆478 Updated 8 months ago
wwweiwei / awesome-self-supervised-learning-for-tabular-data
A collection of research materials on SSL for non-sequential tabular data (SSL4NSTD)
☆189 Updated 2 months ago
ZhangTP1996 / TapTap
☆64 Updated last year
SamsungSAILMontreal / ForestDiffusion
Generating and Imputing Tabular Data via Diffusion and Flow XGBoost Models
☆152 Updated 9 months ago
tennisonliu / GOGGLE
☆27 Updated 2 years ago
NextBrain-ai / nbsynthetic
nbsynthetic is a simple and robust tabular synthetic data generation library for small and medium-sized datasets
☆66 Updated 2 years ago
puhsu / tabular-dl-pretrain-objectives
Revisiting Pretraining Objectives for Tabular Deep Learning
☆63 Updated 2 years ago
autogluon / tabrepo
Compare and ensemble models without retraining
☆55 Updated this week
LevinRoman / tabular-transfer-learning
A repo for transfer learning with deep tabular models
☆102 Updated 2 years ago
iancovert / sage
For calculating global feature importance using Shapley values.
☆268 Updated this week
Baukebrenninkmeijer / table-evaluator
Evaluate real and synthetic datasets against each other
☆88 Updated 4 months ago
feedzai / fairgbm
Train Gradient Boosting models that are both high-performance *and* Fair!
☆104 Updated 10 months ago
Diyago / Tabular-data-generation
GANs are well known for their success in realistic image generation. However, they can also be applied to tabular data generation. We will review …
☆546 Updated last month
noahho / CAAFE
Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automat…
☆157 Updated 4 months ago
aangelopoulos / ppi_py
A package for statistically rigorous scientific discovery using machine learning. Implements prediction-powered inference.
☆236 Updated 4 months ago