awslabs / unified-text2sql-benchmark
UNITE: A Unified Benchmark for Text-to-SQL Evaluation
☆79 Updated 2 months ago
Alternatives and similar repositories for unified-text2sql-benchmark
Users who are interested in unified-text2sql-benchmark are comparing it to the libraries listed below
- ☆97 Updated last week
- ☆374 Updated last year
- The prediction results of ChatGPT on various datasets of Text-to-SQL. ☆102 Updated 2 years ago
- Semantic Evaluation for Text-to-SQL with Distilled Test Suites ☆288 Updated last year
- Introduction page of a challenging text-to-SQL dataset: KaggleDBQA ☆38 Updated last year
- ☆50 Updated 8 months ago
- Using Large Language Models (LLMs) to convert natural language queries to sql ☆48 Updated 9 months ago
- Numbers Station Text to SQL model code. ☆247 Updated last year
- The code for the paper C3: Zero-shot Text-to-SQL with ChatGPT ☆153 Updated last year
- This repository contains all the code for the DTS-SQL paper ☆53 Updated last year
- Code and data for the paper "DBCᴏᴘɪʟᴏᴛ: Natural Language Querying over Massive Database via Schema Routing" (EDBT 2025) ☆114 Updated 3 months ago
- Evaluate the accuracy of LLM generated outputs ☆691 Updated 2 months ago
- Contextual Harnessing for Efficient SQL Synthesis ☆227 Updated 2 months ago
- MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL ☆274 Updated 5 months ago
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆508 Updated this week
- [NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables". ☆130 Updated last year
- The Pytorch implementation of RESDSQL (AAAI 2023). ☆259 Updated last year
- ☆317 Updated last year
- A collection of architectural patterns leveraging Large Language Models (LLMs) for efficient Text-to-SQL generation. ☆235 Updated last year
- Automated Evaluation of RAG Systems ☆637 Updated 4 months ago
- Evaluation tools for Retrieval-augmented Generation (RAG) methods. ☆162 Updated 8 months ago
- Comprehensive benchmark for RAG ☆204 Updated last month
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆159 Updated last year
- The source code of CodeS (SIGMOD 2024). ☆180 Updated 8 months ago
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels … ☆272 Updated last year
- RefChecker provides automatic checking pipeline and benchmark dataset for detecting fine-grained hallucinations generated by Large Langua… ☆383 Updated 2 months ago
- The source code for the schema filter (question + schema only) ☆45 Updated last year
- ☆112 Updated last year
- [ACL Findings 2024] Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm ☆43 Updated last year
- Data for paper "Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness" ☆31 Updated 2 years ago