weAIDB/awesome-data-llm

Official Repository of "LLM × DATA" Survey Paper

☆746

Alternatives and similar repositories for awesome-data-llm

Users that are interested in awesome-data-llm are comparing it to the libraries listed below.

TsinghuaDatabaseGroup / AIDB
ai4db and db4ai work
☆817 · Dec 26, 2024 · Updated last year
code4DB / Index_EAB
☆12 · Jul 11, 2025 · Updated 8 months ago
LumingSun / ML4DB-paper-list
Papers for database systems powered by artificial intelligence (machine learning for database)
☆772 · Mar 5, 2026 · Updated 3 weeks ago
Blondig / Lero-on-PostgreSQL
☆34 · Sep 19, 2023 · Updated 2 years ago
hyrise / rl_index_selection
Paper repository for "SWIRL: Selection of Workload-aware Indexes using Reinforcement Learning" (EDBT 2022)
☆40 · Jul 12, 2025 · Updated 8 months ago
zhaoyue-ntu / qp_evaluation
Query Plan Evaluation
☆16 · Jul 18, 2023 · Updated 2 years ago
db-tu-dresden / FASTgres-PVLDBv16
☆16 · Aug 17, 2023 · Updated 2 years ago
learnedsystems / BaoForPostgreSQL
A prototype implementation of Bao for PostgreSQL
☆216 · Sep 17, 2024 · Updated last year
TsinghuaDatabaseGroup / DB-GPT
An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
☆703 · Dec 27, 2025 · Updated 3 months ago
zhaoyue-ntu / QueryFormer
☆61 · May 12, 2024 · Updated last year
XuanheZhou / ChatBase
☆21 · Jul 20, 2024 · Updated last year
microsoft / dsb
The DSB benchmark is designed for evaluating both workload-driven and traditional database systems on modern decision support workloads. D…
☆73 · Nov 8, 2024 · Updated last year
hjhhsy120 / DBPA
A Benchmark for Transactional Database Performance Anomalies
☆12 · Nov 21, 2023 · Updated 2 years ago
HKUSTDial / NL2SQL_Handbook
This is a continuously updated handbook for readers to easily track the latest Text-to-SQL techniques in the literature and provide pract…
☆1,373 · Mar 3, 2026 · Updated 3 weeks ago
XuanheZhou / LearnedRewrite
An online logical query rewrite demo (schema+sql only)!
☆41 · Jul 25, 2023 · Updated 2 years ago
alibaba / pilotscope
PilotScope is a middleware to bridge the gaps of deploying AI4DB (Artificial Intelligence for Databases) algorithms into actual database …
☆166 · Oct 12, 2024 · Updated last year
StCarmen / PRICE
A Pretrained Model for Cross-Database Cardinality Estimation
☆32 · Apr 30, 2025 · Updated 10 months ago
Nathaniel-Han / End-to-End-CardEst-Benchmark
A new CardEst Benchmark to Bridge AI and DBMS
☆133 · Mar 14, 2023 · Updated 3 years ago
TsinghuaDatabaseGroup / datasets
datasets for database research
☆15 · Aug 25, 2023 · Updated 2 years ago
DAMO-NLP-SG / LLM-R2
☆52 · Nov 26, 2024 · Updated last year
superctj / observatory
Characterization of relational table embeddings (VLDB 2024).
☆32 · Jul 1, 2024 · Updated last year
danolivo / jo-bench
Join Order Benchmark (implicit fork of https://github.com/gregrahn/join-order-benchmark)
☆23 · Feb 4, 2026 · Updated last month
RyanMarcus / imdb_pg_dataset
A Vagrant box that automatically loads the IMDB dataset into Postgres
☆79 · Mar 22, 2024 · Updated 2 years ago
HKUSTDial / StatQA
🔥[NeurIPS'24] Official repository for the paper “Are Large Language Models Good Statisticians?”
☆32 · Apr 13, 2025 · Updated 11 months ago
parimarjan / db-embedding-tools
blah
☆35 · May 5, 2019 · Updated 6 years ago
megagonlabs / sudowoodo
The source code of the Sudowoodo paper in ICDE 2023
☆18 · May 24, 2023 · Updated 2 years ago
Wind-Gone / awesome-ai4db-paper
Paper related to AI4DB techniques
☆106 · Mar 2, 2026 · Updated 3 weeks ago
curtis-sun / LLM4Rewrite
☆21 · Dec 2, 2025 · Updated 3 months ago
mitdbg / palimpzest
A System for Optimized Semantic Computation
☆207 · Mar 20, 2026 · Updated last week
SolidLao / GPTuner
GPTuner is a manual-reading database tuning system leveraging domain knowledge automatically and extensively to enhance knob tuning proces…
☆126 · Jul 3, 2025 · Updated 8 months ago
AlibabaIncubator / Lero-on-PostgreSQL
☆35 · Aug 31, 2022 · Updated 3 years ago
HKUSTDial / HAIChart
Official repository for the paper “HAIChart: Human and AI Paired Visualization System” (VLDB'24)
☆34 · Nov 4, 2024 · Updated last year
wuziniu / unified_model
A Unified Transferable Model for ML-Enhanced DBMS
☆14 · Feb 2, 2022 · Updated 4 years ago
tiannuo-yang / VDTuner
[ICDE 2024] VDTuner - Automated Performance Tuning for Vector Data Management Systems (Vector Databases)
☆35 · Apr 21, 2024 · Updated last year
HKUSTDial / NL2SQL-Bugs-Benchmark
🔥[SIGKDD'25] NL2SQL-BUGs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation.
☆32 · Sep 22, 2025 · Updated 6 months ago
parimarjan / LatencyPredictor
☆10 · Nov 16, 2023 · Updated 2 years ago
gregrahn / join-order-benchmark
Join Order Benchmark (JOB)
☆352 · Feb 16, 2025 · Updated last year
DataManagementLab / zero-shot-cost-estimation
Implementation of our VLDB'22 paper "Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction"
☆54 · Nov 11, 2022 · Updated 3 years ago
HKUSTDial / LEAD
🔥[VLDB'26] Official repository for the paper "LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning".
☆109 · Jun 3, 2025 · Updated 9 months ago