defog-ai / defog-data
This repository contains the metadata and data of different databases that we use for testing
☆12 Updated this week
Alternatives and similar repositories for defog-data:
Users who are interested in defog-data are comparing it to the repositories listed below:
- Introduction page of a challenging text-to-SQL dataset: KaggleDBQA ☆35 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆124 Updated 10 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆66 Updated 3 months ago
- ☆47 Updated 6 months ago
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness ☆97 Updated this week
- A desktop compatible version of the Defog app ☆10 Updated 5 months ago
- Translating natural language questions to a structured query language ☆224 Updated last year
- ☆14 Updated this week
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks ☆50 Updated 10 months ago
- GAP-text2SQL: Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training ☆101 Updated 10 months ago
- CLIR version of ColBERT ☆67 Updated 4 months ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆67 Updated 10 months ago
- ☆16 Updated 8 months ago
- ☆97 Updated 2 years ago
- Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data ☆101 Updated 3 years ago
- This repository contains all the code for the DTS-SQL paper ☆46 Updated 6 months ago
- Evaluating tool-augmented LLMs in conversation settings ☆76 Updated 8 months ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆107 Updated 2 years ago
- codebase release for EMNLP2023 paper publication ☆19 Updated 11 months ago
- ☆132 Updated last year
- ☆40 Updated 2 months ago
- IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing … ☆43 Updated 5 months ago
- Dataset and code for EMNLP 2020 paper "HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data" ☆224 Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆63 Updated last year
- Inference engine for GLiNER models, in Rust ☆37 Updated this week
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆74 Updated 3 months ago
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. ☆173 Updated 6 months ago
- This is the code for our KILT leaderboard submissions (KGI + Re2G models). ☆152 Updated last year
- ☆50 Updated 3 months ago
- Pre-train Static Word Embeddings ☆42 Updated this week