GoFigure-LANL / DeepPatent-dataset
Large-scale dataset of patent drawings and image retrieval baseline.
☆30 · Updated 2 years ago
Related projects:
- ☆129 · Updated last year
- SciCap Dataset ☆48 · Updated 2 years ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆63 · Updated last year
- [ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral) ☆36 · Updated 11 months ago
- ☆56 · Updated 9 months ago
- ☆33 · Updated last year
- [WWW 2022] Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations ☆85 · Updated 2 years ago
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆166 · Updated last year
- ☆58 · Updated last month
- The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. … ☆136 · Updated last year
- Two approaches for robust TableQA: 1) ITR is a general-purpose retrieval-based approach for handling long tables in TableQA transformer m… ☆29 · Updated last year
- Multimodal and multilingual topic model with pretrained embeddings ☆10 · Updated last year
- Dataset, models, and code for paper "CiteSum: Citation Text-guided Scientific Extreme Summarization and Low-resource Domain Adaptation", … ☆33 · Updated 2 years ago
- ReadingBank: A Benchmark Dataset for Reading Order Detection ☆90 · Updated 3 weeks ago
- Implementation of ECIR 2022 Paper: How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generat… ☆15 · Updated 2 years ago
- Research papers about Chain of Thought (CoT) ☆32 · Updated 10 months ago
- Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings" ☆290 · Updated last year
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆40 · Updated 3 years ago
- Code for ACL paper "Zero-Shot Text Classification via Self-Supervised Tuning" ☆22 · Updated 11 months ago
- Code/data for MARG (multi-agent review generation) ☆24 · Updated 4 months ago
- SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples ☆73 · Updated 2 years ago
- Code for ACL2023 paper: Pre-Training to Learn in Context ☆106 · Updated last month
- ☆61 · Updated last year
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023) ☆73 · Updated 11 months ago
- Long-context pretrained encoder-decoder models ☆95 · Updated last year
- Code for ECIR 2022 paper Local Citation Recommendation with Hierarchical-Attention Text Encoder and SciBERT-based Reranking ☆20 · Updated last month
- ☆24 · Updated 3 months ago
- ☆56 · Updated last year
- Model zoo for topic models, neural topic models, contextual embeddings for topic models ... ☆40 · Updated last year
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆65 · Updated 11 months ago