cvzoya / visuallydata
A large-scale infographics dataset from Visual.ly with metadata and additional crowdsourced annotations
☆13 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for visuallydata
- Release for CHART annotation tools used for ICDAR CHART 2019 competition ☆26 · Updated last year
- A large-scale curated dataset of Visual.ly infographics with metadata and additional crowdsourced annotations for research applications i… ☆29 · Updated 5 years ago
- [CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning ☆91 · Updated 7 months ago
- Novel Object Captioner - Captioning Images with diverse objects ☆41 · Updated 6 years ago
- Document Visual Question Answering ☆110 · Updated 4 years ago
- DVQA Dataset: A Bar chart question answering dataset presented at CVPR 2018 ☆32 · Updated 5 years ago
- PyTorch Implementation of Value Retrieval with Arbitrary Queries for Form-like Documents. ☆16 · Updated last year
- Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer… ☆52 · Updated 3 weeks ago
- Implementation of seq2seq model for Visual Storytelling Challenge (VIST): http://visionandlanguage.net/VIST/index.html ☆58 · Updated 6 years ago
- ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data. ☆85 · Updated last year
- Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI2021] ☆56 · Updated 2 years ago
- Official PyTorch Implementation of DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis - ICDAR 2021 ☆71 · Updated 3 years ago
- Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020. ☆63 · Updated 3 years ago
- Image Captioning through Image Transformer ☆40 · Updated 3 years ago
- This is a data repository for the ACL 2020 paper: "Let Me Choose: From Verbal Context to Font Selection" ☆10 · Updated 4 years ago
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos ☆117 · Updated last year
- VisualMRC: Machine Reading Comprehension on Document Images (AAAI2021) ☆51 · Updated 3 years ago
- Learning UI Similarity using Graph Networks ☆35 · Updated 3 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022) ☆235 · Updated 2 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts ☆185 · Updated 2 years ago
- A modular framework for Visual Question Answering research by the FAIR A-STAR team ☆45 · Updated 3 years ago
- VINS: Visual Search for Mobile User Interface Design ☆31 · Updated 3 years ago
- Implementation of CanvasVAE: Learning to Generate Vector Graphic Documents, ICCV 2021 ☆62 · Updated last year
- Flickr30K Entities Dataset ☆166 · Updated 5 years ago
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week. ☆34 · Updated 3 years ago
- Implementation of LayoutGAN (https://arxiv.org/abs/1901.06767) ☆17 · Updated 5 years ago