Sid2697 / Word-recognition-EmbedNet-CAB
Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"
☆21 Updated 3 years ago
Alternatives and similar repositories for Word-recognition-EmbedNet-CAB
Users who are interested in Word-recognition-EmbedNet-CAB are comparing it to the repositories listed below
- Code implementation for our DAS, 2020 paper titled "Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval" ☆15 Updated 9 months ago
- Labeled Movie Trailer Dataset ☆16 Updated 7 years ago
- An easy-to-use app to visualise attentions of various VQA models. ☆41 Updated 2 years ago
- Code for our ICCC'19 paper - "Trick or TReAT : Thematic Reinforcement for Artistic Typography" ☆19 Updated 3 years ago
- 12-in-1: Multi-Task Vision and Language Representation Learning Web Demo ☆35 Updated 2 years ago
- A collection of multimodal datasets, and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal" ☆82 Updated 3 years ago
- A unified framework to jointly model images, text, and human attention traces. ☆78 Updated 3 years ago
- Used LSTM on Flickr dataset ☆12 Updated 7 years ago
- ☆44 Updated 3 years ago
- Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020) ☆227 Updated 2 years ago
- COMIC: This is the code repo of our TMM2019 work titled "COMIC: Towards a Compact Image Captioning Model with Attention". ☆15 Updated 3 years ago
- "LipNet: End-to-End Sentence-level Lipreading" in PyTorch ☆70 Updated 5 years ago
- Video captioning baseline models on Video2Commonsense Dataset. ☆56 Updated 4 years ago
- Implementation of the paper "Stacked Attention Networks for Image Question Answering" in TensorFlow ☆13 Updated 5 years ago
- Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Generating a… ☆85 Updated 2 years ago
- Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19 ☆32 Updated 5 years ago
- [CVPR 2019] PyTorch code for Audio Visual Scene-Aware Dialog ☆34 Updated 4 years ago
- ☆37 Updated 3 years ago
- AViD Dataset: Anonymized Videos from Diverse Countries ☆56 Updated 2 years ago
- [EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering ☆178 Updated 2 years ago
- ☆28 Updated 5 years ago
- Code for our paper: *Shamsian, *Kleinfeld, Globerson & Chechik, "Learning Object Permanence from Video" ☆68 Updated 5 months ago
- Implementations of Transformers for Video ☆23 Updated 4 years ago
- Here we describe a new approach to train a video captioning neural network, that is not only based on the normal cross entropy loss for … ☆7 Updated 5 years ago
- Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021 ☆107 Updated 11 months ago
- Repository for ACL2020 paper "Refer360°: A Referring Expression Recognition Dataset in 360° Images" ☆13 Updated 3 years ago
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset ☆56 Updated 4 years ago
- Shapley values for assessing the importance of each frame in a video ☆17 Updated 4 years ago
- A PyTorch implementation of the paper Generative Adversarial Text-to-Image Synthesis ☆25 Updated 5 years ago
- Tooling to play around with multilingual machine translation for Indian Languages. ☆22 Updated 3 years ago