Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).
☆188 Dec 20, 2024 Updated last year
Alternatives and similar repositories for CAAFE
Users that are interested in CAAFE are comparing it to the libraries listed below.
- Official implementation of Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning (NeurIPS 2024). ☆33 Mar 4, 2025 Updated last year
- ☆44 May 2, 2024 Updated 2 years ago
- Tabular In-Context Learning ☆114 Mar 6, 2025 Updated last year
- Foundation Model for Tabular Data via reticulate ☆28 Updated this week
- ⚡ Easy API access to the tabular foundation model TabPFN ⚡ ☆232 Updated this week
- Neural Pipeline Search (NePS): Helps deep learning experts find the best neural pipeline. ☆79 Apr 22, 2026 Updated last week
- TuneTables is a tabular classifier that implements prompt tuning for frozen prior-fitted networks. ☆24 Mar 31, 2025 Updated last year
- Official implementation of "TabEBM: A Tabular Data Augmentation Method with Class-Specific Energy-Based Models", NeurIPS 2024 ☆25 Aug 19, 2025 Updated 8 months ago
- TabPFGen: Synthetic Tabular Data Generation with TabPFN ☆40 Jul 15, 2025 Updated 9 months ago
- Ensemble-based, size-agnostic wrapper for the TabPFN classifier ☆34 May 18, 2024 Updated last year
- Zero-shot Time Series Forecasting with TabPFN (work accepted at NeurIPS 2024 TRL and TSALM workshops) ☆385 Apr 26, 2026 Updated last week
- The PyExperimenter is a tool for the automatic execution of experiments, e.g. for machine learning (ML), capturing corresponding results … ☆39 Mar 18, 2026 Updated last month
- Code for "TabZilla: When Do Neural Nets Outperform Boosted Trees on Tabular Data?" ☆180 Mar 22, 2024 Updated 2 years ago
- Interpretable ML for TabPFN ☆51 Jul 13, 2025 Updated 9 months ago
- ⚡ TabPFN: Foundation Model for Tabular Data ⚡ ☆6,195 Updated this week
- This work introduces LaT-PFN, a novel time series model that combines PFN and JEPA frameworks to generate zero-shot forecasts efficientl… ☆20 Aug 1, 2024 Updated last year
- Our maintained PFN repository. Come here to train SOTA PFNs. ☆140 Jan 21, 2026 Updated 3 months ago
- Official repository for "Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars" (NeurIPS 2023) ☆17 Oct 26, 2023 Updated 2 years ago
- [ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆116 Aug 17, 2025 Updated 8 months ago
- ☆32 Jan 28, 2025 Updated last year
- ☆16 Nov 25, 2022 Updated 3 years ago
- A learning curve benchmark on OpenML data ☆34 Nov 29, 2024 Updated last year
- ☆338 Jun 19, 2024 Updated last year
- TabICLv2: A state-of-the-art tabular foundation model ☆810 Updated this week
- ☆45 Aug 2, 2024 Updated last year
- ☆17 Jan 23, 2023 Updated 3 years ago
- OpenFE: automated feature generation with expert-level performance ☆869 May 27, 2024 Updated last year
- ☆19 Feb 28, 2025 Updated last year
- ☆22 Oct 30, 2024 Updated last year
- ☆32 Jun 24, 2024 Updated last year
- A Framework for Comparing N Hyperparameter Optimizers on M Benchmarks. ☆19 Apr 22, 2026 Updated last week
- Computing the gap statistics from Tibshirani et al. for various clustering algorithms ☆13 Nov 10, 2025 Updated 5 months ago
- In-context Bayesian Optimization ☆17 Feb 20, 2026 Updated 2 months ago
- [NeurIPS 2023] Multi-fidelity hyperparameter optimization with deep power laws that achieves state-of-the-art results across diverse benc… ☆20 Nov 12, 2023 Updated 2 years ago
- [EMNLP 2024 Findings] Benchmarking Language Model Agents for Data-Driven Science ☆35 Oct 25, 2024 Updated last year
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models ☆10 Oct 27, 2023 Updated 2 years ago
- A Living Benchmark for Machine Learning on Tabular Data ☆214 Apr 24, 2026 Updated last week
- ☆26 Mar 9, 2023 Updated 3 years ago
- AIDE: AI-Driven Exploration in the Space of Code. The machine Learning engineering agent that automates AI R&D. ☆1,245 Apr 21, 2026 Updated last week