Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).
☆192 · Dec 20, 2024 · Updated last year
Alternatives and similar repositories for CAAFE
Users that are interested in CAAFE are comparing it to the libraries listed below.
- Official implementation of Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning (NeurIPS 2024). ☆33 · Mar 4, 2025 · Updated last year
- Tabular In-Context Learning ☆115 · Mar 6, 2025 · Updated last year
- Foundation Model for Tabular Data via reticulate ☆33 · May 14, 2026 · Updated last week
- ⚡ Easy API access to the tabular foundation model TabPFN ⚡ ☆240 · Updated this week
- Neural Pipeline Search (NePS): Helps deep learning experts find the best neural pipeline. ☆79 · May 6, 2026 · Updated 2 weeks ago
- Official implementation of "TabEBM: A Tabular Data Augmentation Method with Class-Specific Energy-Based Models", NeurIPS 2024 ☆25 · Aug 19, 2025 · Updated 9 months ago
- TabPFGen: Synthetic Tabular Data Generation with TabPFN ☆41 · Jul 15, 2025 · Updated 10 months ago
- Ensemble-based, size-agnostic wrapper for the TabPFN classifier ☆34 · May 18, 2024 · Updated 2 years ago
- The first collection of surrogate benchmarks for Joint Architecture and Hyperparameter Search. ☆15 · Mar 22, 2023 · Updated 3 years ago
- Zero-shot Time Series Forecasting with TabPFN (work accepted at NeurIPS 2024 TRL and TSALM workshops) ☆397 · May 15, 2026 · Updated last week
- a minimal website to get the diff of llm rewrites ☆11 · Dec 11, 2024 · Updated last year
- The PyExperimenter is a tool for the automatic execution of experiments, e.g. for machine learning (ML), capturing corresponding results … ☆39 · Mar 18, 2026 · Updated 2 months ago
- Amortized Inference for Causal Structure Learning, NeurIPS 2022 ☆76 · Feb 11, 2025 · Updated last year
- Code for "TabZilla: When Do Neural Nets Outperform Boosted Trees on Tabular Data?" ☆181 · Mar 22, 2024 · Updated 2 years ago
- ⚡ TabPFN: Foundation Model for Tabular Data ⚡ ☆7,057 · May 15, 2026 · Updated last week
- [TMLR 2026] LLM-FE: Automated Feature Engineering with Large Language Models ☆73 · May 10, 2026 · Updated last week
- Official repository for "Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars" (NeurIPS 2023) ☆17 · Oct 26, 2023 · Updated 2 years ago
- Our maintained PFN repository. Come here to train SOTA PFNs. ☆144 · Jan 21, 2026 · Updated 4 months ago
- ☆35 · May 6, 2026 · Updated 2 weeks ago
- ☆16 · Nov 25, 2022 · Updated 3 years ago
- ☆342 · Jun 19, 2024 · Updated last year
- ☆21 · Jan 13, 2022 · Updated 4 years ago
- ☆19 · Feb 28, 2025 · Updated last year
- Official implementation of "DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning" in ICML'24 ☆233 · Dec 3, 2024 · Updated last year
- ☆22 · Oct 30, 2024 · Updated last year
- Official repository for the paper "Zero-Shot AutoML with Pretrained Models" ☆48 · Dec 29, 2023 · Updated 2 years ago
- In-context Bayesian Optimization ☆18 · Feb 20, 2026 · Updated 3 months ago
- [NeurIPS 2023] Multi-fidelity hyperparameter optimization with deep power laws that achieves state-of-the-art results across diverse benc… ☆20 · Nov 12, 2023 · Updated 2 years ago
- sktime - python toolbox for time series: pipelines and transformers ☆26 · Dec 1, 2022 · Updated 3 years ago
- A Living Benchmark for Machine Learning on Tabular Data ☆227 · Updated this week
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models ☆10 · Oct 27, 2023 · Updated 2 years ago
- AIDE: AI-Driven Exploration in the Space of Code. The machine Learning engineering agent that automates AI R&D. ☆1,277 · May 2, 2026 · Updated 3 weeks ago
- [NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets ☆90 · Feb 28, 2023 · Updated 3 years ago
- Multi-Agent System Powered by LLMs for End-to-end Multimodal ML Automation ☆280 · Mar 20, 2026 · Updated 2 months ago
- A benchmark of meaningful graph datasets with tabular node features ☆15 · Oct 29, 2025 · Updated 6 months ago
- ☆91 · Jan 27, 2026 · Updated 3 months ago
- Tabular data imputation and generation, with flexible modeling of quantitative features via hierarchical binning (TMLR, 2025) ☆16 · Mar 10, 2025 · Updated last year
- Performant, composable online learning ☆16 · Feb 22, 2021 · Updated 5 years ago
- Training code for TabDPT: Scaling Tabular Foundation Models on Real Data ☆56 · Aug 3, 2025 · Updated 9 months ago