due-benchmark / du-schema
JSON Schema format for storing dataset details, processed document contents, and document annotations in the document understanding domain.
☆13 · Updated 10 months ago
Alternatives and similar repositories for du-schema
Users who are interested in du-schema are comparing it to the repositories listed below.
- The code related to the baselines from NeurIPS 2021 paper "DUE: End-to-End Document Understanding Benchmark." ☆36 · Updated 2 years ago
- VisualMRC: Machine Reading Comprehension on Document Images (AAAI2021) ☆56 · Updated 5 months ago
- A Multi-subject High School Examinations Dataset for Cross-lingual and Multilingual Question Answering ☆44 · Updated 3 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆120 · Updated 3 years ago
- ☆58 · Updated 4 years ago
- A large scientific paraphrase dataset for longer paraphrase generation ☆39 · Updated 2 years ago
- Text Extraction Formulation + Feedback Loop for state-of-the-art WSD (EMNLP 2021) ☆53 · Updated 3 years ago
- A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations ☆56 · Updated 3 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 · Updated 2 years ago
- Contrastive Fact Verification ☆73 · Updated 3 years ago
- ☆92 · Updated 3 years ago
- PyTorch implementation and pre-trained models for ASP - Autoregressive Structured Prediction with Language Models, EMNLP 22. https://arxi… ☆107 · Updated last year
- Code for NAACL 2021 full paper "Efficient Attentions for Long Document Summarization" ☆67 · Updated 4 years ago
- Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation" ☆65 · Updated 3 years ago
- ☆68 · Updated 4 months ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆79 · Updated 3 years ago
- The official code for PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization ☆156 · Updated 2 years ago
- ☆67 · Updated 3 years ago
- Dataset for NAACL 2021 paper: "DART: Open-Domain Structured Data Record to Text Generation" ☆155 · Updated 2 years ago
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 · Updated 4 years ago
- [NAACL 2022] Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning. ☆57 · Updated last year
- This repo supports various cross-lingual transfer learning & multilingual NLP models. ☆92 · Updated 2 years ago
- Code for the CRAC 2021 paper "On Generalization in Coreference Resolution" (Best short paper award) ☆35 · Updated 2 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 · Updated 3 years ago
- Efficient Memory-Augmented Transformers ☆34 · Updated 2 years ago
- Knowledge Infused Decoding ☆71 · Updated last year
- The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch. ☆104 · Updated 2 years ago
- Repository for the paper "Named Entity Recognition for Entity Linking: What Works and What's Next" (EMNLP 2021). ☆76 · Updated 3 years ago
- This is the code for the EMNLP 2020 Findings paper "BERT for Monolingual and Cross-Lingual Reverse Dictionary" ☆19 · Updated 4 years ago
- Simple Questions Generate Named Entity Recognition Datasets (EMNLP 2022) ☆76 · Updated 2 years ago