Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).
☆180 Dec 20, 2024 Updated last year
Alternatives and similar repositories for CAAFE
Users who are interested in CAAFE are comparing it to the libraries listed below.
- ☆43 May 2, 2024 Updated last year
- TuneTables is a tabular classifier that implements prompt tuning for frozen prior-fitted networks. ☆23 Mar 31, 2025 Updated 11 months ago
- ⚡ Easy API access to the tabular foundation model TabPFN ⚡ ☆228 Feb 23, 2026 Updated last week
- Tabular In-Context Learning ☆109 Mar 6, 2025 Updated 11 months ago
- Interpretable ML for TabPFN ☆47 Jul 13, 2025 Updated 7 months ago
- ☆16 May 26, 2022 Updated 3 years ago
- The first collection of surrogate benchmarks for Joint Architecture and Hyperparameter Search. ☆15 Mar 22, 2023 Updated 2 years ago
- Ensemble-based, size-agnostic wrapper for the TabPFN classifier ☆34 May 18, 2024 Updated last year
- The PyExperimenter is a tool for the automatic execution of experiments, e.g. for machine learning (ML), capturing corresponding results … ☆38 Oct 10, 2025 Updated 4 months ago
- Code accompanying https://arxiv.org/abs/1802.02219 ☆19 Oct 5, 2022 Updated 3 years ago
- ⚡ TabPFN: Foundation Model for Tabular Data ⚡ ☆5,766 Updated this week
- ☆21 Jan 13, 2022 Updated 4 years ago
- ☆20 Jun 3, 2023 Updated 2 years ago
- Code for "TabZilla: When Do Neural Nets Outperform Boosted Trees on Tabular Data?" ☆178 Mar 22, 2024 Updated last year
- ☆45 Aug 2, 2024 Updated last year
- Zero-shot Time Series Forecasting with TabPFN (work accepted at NeurIPS 2024 TRL and TSALM workshops) ☆357 Feb 19, 2026 Updated last week
- ☆330 Jun 19, 2024 Updated last year
- A rule-based approach to explain the output of any machine learning model ☆15 Apr 4, 2024 Updated last year
- Package to estimate the grouping loss of a classifier, based on the paper "Beyond calibration: estimating the grouping loss of modern neu… ☆11 Dec 14, 2024 Updated last year
- Official implementation of "TabEBM: A Tabular Data Augmentation Method with Class-Specific Energy-Based Models", NeurIPS 2024 ☆23 Aug 19, 2025 Updated 6 months ago
- a minimal website to get the diff of llm rewrites ☆11 Dec 11, 2024 Updated last year
- Our maintained PFN repository. Come here to train SOTA PFNs. ☆134 Jan 21, 2026 Updated last month
- Performant, composable online learning ☆16 Feb 22, 2021 Updated 5 years ago
- ☆11 Nov 5, 2024 Updated last year
- A benchmark of meaningful graph datasets with tabular node features ☆14 Oct 29, 2025 Updated 4 months ago
- Rice Yield CNN is a model to estimate the rice yield based on RGB image of rice canopy at harvest. The model is developed based on more t… ☆11 Nov 8, 2021 Updated 4 years ago
- Official repository for the paper "Zero-Shot AutoML with Pretrained Models" ☆48 Dec 29, 2023 Updated 2 years ago
- [ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆106 Aug 17, 2025 Updated 6 months ago
- ☆31 Jun 24, 2024 Updated last year
- [EMNLP 2024 Findings] Benchmarking Language Model Agents for Data-Driven Science ☆34 Oct 25, 2024 Updated last year
- Official implementation of "DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning" in ICML'24 ☆226 Dec 3, 2024 Updated last year
- A Framework for Comparing N Hyperparameter Optimizers on M Benchmarks. ☆19 Feb 22, 2026 Updated last week
- Code release for DeepEDM (ICML 2025) ☆27 Jan 20, 2026 Updated last month
- ☆16 Nov 25, 2022 Updated 3 years ago
- An easy-to-use ML pipeline package for Python inspired by scikit-learn pipeline and PyTorch layers. ☆12 Aug 27, 2023 Updated 2 years ago
- ☆27 Mar 9, 2023 Updated 2 years ago
- Multi-Agent System Powered by LLMs for End-to-end Multimodal ML Automation ☆255 Jan 30, 2026 Updated last month
- [NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets ☆88 Feb 28, 2023 Updated 3 years ago
- Foundation Model for Tabular Data via reticulate ☆21 Dec 4, 2025 Updated 2 months ago