Sid2697 / Word-recognition-EmbedNet-CAB
Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"
☆21Updated 4 years ago
Alternatives and similar repositories for Word-recognition-EmbedNet-CAB
Users that are interested in Word-recognition-EmbedNet-CAB are comparing it to the libraries listed below
- Code implementation for our DAS, 2020 paper titled "Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval"☆15Updated last year
- Explores jigsaw puzzle solving as a pretext task for fine-grained classification for bird species identification (Implemented with pyTor…☆21Updated 5 years ago
- A Keras Implementation of Supervised Video Summarization using Attention Based Encoder-Decoder Networks☆29Updated 3 years ago
- Collection of useful FFMPEG commands for processing audio and video files.☆44Updated 6 years ago
- [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations☆563Updated last year
- A unified framework to jointly model images, text, and human attention traces.☆78Updated 4 years ago
- A neural network architecture (CNN+LSTM) that automatically generates captions from images. The model uses ResNet architecture to trai …☆25Updated 5 years ago
- Unofficial implementation of Google DeepMind's paper `Objects that Sound`☆83Updated 7 years ago
- Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)☆228Updated 2 years ago
- Image Captioning: Implementing the Neural Image Caption Generator with Python☆63Updated 7 years ago
- An easy-to-use app to visualise attentions of various VQA models.☆41Updated 2 years ago
- Code for our Source-free Unsupervised Video Domain Adaptation Paper☆10Updated 6 months ago
- Generic framework for ML projects☆19Updated 2 years ago
- State of the Art Language models and Classifier for Hindi language (spoken in Indian sub-continent)☆123Updated 5 years ago
- 12-in-1: Multi-Task Vision and Language Representation Learning Web Demo☆35Updated 2 years ago
- CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering☆75Updated 5 years ago
- Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch☆65Updated 6 years ago
- Code to train and evaluate the GeNeVA-GAN model for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Generating a…☆85Updated 2 years ago
- Audio Visual Instance Discrimination with Cross-Modal Agreement☆129Updated 3 years ago
- [EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering☆179Updated 2 years ago
- Shapley values for assessing the importance of each frame in a video☆17Updated 4 years ago
- Lip Reading in the Wild using ResNet and LSTMs in PyTorch☆58Updated 7 years ago
- A Dataset for Grounded Video Description☆162Updated 3 years ago
- Labeled Movie Trailer Dataset☆16Updated 7 years ago
- Real-world photo sequence question answering system (MemexQA). CVPR'18 and TPAMI'19☆32Updated 6 years ago
- Video Captioning is an encoder-decoder model based on sequence-to-sequence learning☆136Updated last year
- PyTorch implementation of DRAW: A Recurrent Neural Network For Image Generation trained on Devanagari dataset.☆89Updated 4 years ago
- Code for the cocktail-party problem of isolating and enhancing the speech for the target speaker☆17Updated 3 years ago