ddenron / deco_dataset
This repository holds the annotated spreadsheet files that comprise the DECO dataset.
☆13Updated 6 years ago
Alternatives and similar repositories for deco_dataset
Users who are interested in deco_dataset are comparing it to the repositories listed below.
- ☆39Updated 4 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification☆179Updated 2 years ago
- Zero-shot entity linking with less data☆12Updated 3 years ago
- Code and data for "TURL: Table Understanding through Representation Learning"☆123Updated 3 years ago
- TUTA and ForTaP for Structure-Aware and Numerical-Reasoning-Aware Table Pre-Training☆119Updated 9 months ago
- Code and experiment data for ICDM'19 paper, tabular cell classification using pre-trained cell embeddings. Note that the code and data is…☆27Updated 2 years ago
- multimodal document analysis☆164Updated last year
- TAT-QA (Tabular And Textual dataset for Question Answering) contains 16,552 questions associated with 2,757 hybrid contexts from real-wor…☆116Updated 8 months ago
- [ACL 2022] A hierarchical table dataset for question answering and data-to-text generation.☆91Updated 5 months ago
- MTab: Entity Search and Table Annotation with Wikidata, Wikipedia, and DBpedia☆31Updated 3 years ago
- Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering☆30Updated 2 years ago
- A tool for extracting arbitrary tables from untagged PDF documents☆39Updated 4 years ago
- A set of Python scripts for preprocessing the Wikidata JSON dump and running simple queries in an efficient manner.☆130Updated 10 months ago
- CODEC is a document and entity ranking dataset that focuses on complex essay-style topics.☆17Updated 8 months ago
- Publicly released code for the LAMBERT model☆103Updated 4 years ago
- ☆20Updated 2 years ago
- ☆81Updated 3 years ago
- Two approaches for robust TableQA: 1) ITR is a general-purpose retrieval-based approach for handling long tables in TableQA transformer m…☆39Updated 2 years ago
- [SIGIR 2021] Retrieving Complex Tables with Multi-Granular Graph Representation Learning.☆48Updated 2 years ago
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution☆35Updated 2 years ago
- Knowledge base construction from raw scientific documents☆38Updated last month
- ReFinED is an efficient and accurate entity linking (EL) system.☆219Updated 8 months ago
- Implementation of paper: HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking☆72Updated 2 years ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents☆26Updated 2 years ago
- The official repository for "Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP" published in ACL-IJNLP 2…☆20Updated 3 years ago
- JSON Schema format for storing datasets details, documents processed contents, and documents annotations in the document understanding do…☆13Updated 9 months ago
- The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark."☆36Updated 2 years ago
- This is the code for our KILT leaderboard submissions (KGI + Re2G models).☆157Updated 3 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)☆71Updated 2 years ago
- ☆58Updated 4 years ago