weAIDB / awesome-data-llm
Official Repository of "LLM × DATA" Survey Paper
☆625 Updated 3 weeks ago
Alternatives and similar repositories for awesome-data-llm
Users that are interested in awesome-data-llm are comparing it to the libraries listed below
- Continuously updated paper list on advancements in Data Agents. Companion repo to our paper "A Survey of Data Agents: Emerging Paradigm o… ☆354 Updated 3 weeks ago
- An unstructured data analytics system via LLM ☆23 Updated 5 months ago
- GPTuner is a manual-reading database tuning system leveraging domain knowledge automatically and extensively to enhance knob tuning proces… ☆121 Updated 6 months ago
- 🔥[ICML'25] Official repository for the paper "Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search" ☆142 Updated 2 weeks ago
- 🔥[VLDB'24] Official repository for the paper “The Dawn of Natural Language to SQL: Are We Fully Ready?” ☆140 Updated 3 months ago
- The source code of CodeS (SIGMOD 2024). ☆194 Updated last year
- 🔥[SIGKDD'25] NL2SQL-BUGs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation. ☆29 Updated 4 months ago
- A live reading list for LLM data synthesis (Updated to July, 2025). ☆438 Updated 4 months ago
- PilotScope is a middleware to bridge the gaps of deploying AI4DB (Artificial Intelligence for Databases) algorithms into actual database … ☆165 Updated last year
- ☆51 Updated last year
- ☆25 Updated 7 months ago
- [NeurIPS'25] Official Repository for the Paper "SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning" ☆119 Updated 2 months ago
- Collection of training data management explorations for large language models ☆336 Updated last year
- AI4DB and DB4AI work ☆815 Updated last year
- Contextual Harnessing for Efficient SQL Synthesis ☆259 Updated 7 months ago
- [ICDE 2024] VDTuner - Automated Performance Tuning for Vector Data Management Systems (Vector Databases) ☆33 Updated last year
- This is a continuously updated handbook for readers to easily track the latest Text-to-SQL techniques in the literature and provide pract… ☆1,251 Updated last week
- Vector Retrieval and RAG in Practice: Techniques, Implementation, and Applications ☆146 Updated last year
- A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration ☆118 Updated 5 months ago
- An LLM Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf) ☆692 Updated 3 weeks ago
- PostgreSQL extension for supporting deep learning model inference within the database and vector storage ☆58 Updated 3 months ago
- Official repository for the paper "EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing". ☆21 Updated 5 months ago
- 🔥[NeurIPS'24] Official repository for the paper “Are Large Language Models Good Statisticians?” ☆32 Updated 9 months ago
- DataMosaic: Explainable and Verifiable Document-Based Data Analytics ☆20 Updated 6 months ago
- LLM-based Dialect Translation System ☆76 Updated 3 months ago
- [VLDB'25] Synthesizing High-quality Text-to-SQL Data at Scale. SynSQL-2.5M is the first million-scale cross-domain text-to-SQL dataset. ☆409 Updated 4 months ago
- 🏆 Winning NeurIPS (NIPS) Competition Track: Big ANN, Practical Vector Search Challenge 2023. (see big-ann-benchmark https://big-ann-benc… ☆30 Updated last year
- A System for Optimized Semantic Computation ☆186 Updated last week
- The source code for the schema filter (question + schema only) ☆47 Updated last year
- Fine-Tuning Dataset Auto-Generation for Graph Query Languages. ☆87 Updated 2 months ago