Sid2697 / Word-recognition-and-retrieval
Code implementation for our DAS, 2020 paper titled "Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval"
☆14Updated last month
Related projects:
- Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"☆21Updated 3 years ago
- Code for "Weakly-supervised Fingerspelling Recognition in British Sign Language Videos", BMVC 2022.☆10Updated last year
- Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021☆102Updated 3 months ago
- menovideo: pytorch library for video action recognition and video understanding☆28Updated 2 years ago
- Download DeepMind's Kinetics dataset.☆19Updated 4 years ago
- Implementations of Transformers for Video☆24Updated 3 years ago
- Adds "face" bounding boxes to the COCO images dataset.☆45Updated 4 years ago
- Datasets, transforms and samplers for video in PyTorch☆86Updated 11 months ago
- Easiest way of fine-tuning HuggingFace video classification models☆131Updated last year
- My experimentation around action recognition in videos. Contains Keras implementation for C3D network based on original paper "Learning S…☆53Updated last month
- "LipNet: End-to-End Sentence-level Lipreading" in PyTorch☆64Updated 5 years ago
- Lip Reading in the Wild using ResNet and LSTMs in PyTorch☆58Updated 6 years ago
- Model submitted for the ICMI 2018 EmotiW Group-Level Emotion Recognition Challenge☆79Updated 5 years ago
- PyTorch-based Ultimate Deep Learning Research Tool focusing on Video Action Recognition☆23Updated last year
- End-to-end pipeline for lip reading at the word level using a tensorflow CNN implementation.☆33Updated 4 years ago
- Flickr Diverse Faces (FDF) is a dataset with 1.5M faces "in the wild".☆45Updated last year
- Unofficial Implementation of Google Deepmind's paper `Objects that Sound`☆83Updated 6 years ago
- For ICDAR 2019 Paper on End-to-end License Plate and Scene Text Recognition with multi-head attention models☆26Updated 3 years ago
- AViD Dataset: Anonymized Videos from Diverse Countries☆55Updated last year
- A Comprehensive Tutorial on Video Modeling☆65Updated 4 years ago
- I3D implementation in Keras + video preprocessing + visualization of results☆43Updated last year
- Code and models for evaluating a state-of-the-art lip reading network☆187Updated last year
- PyTorch implementation of DRAW: A Recurrent Neural Network For Image Generation trained on Devanagari dataset.☆89Updated 4 years ago
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset☆52Updated 3 years ago
- Course Project for CS763 Computer Vision IIT Bombay☆33Updated 6 years ago
- Exploration of different solutions to action recognition in video, using neural networks implemented in PyTorch.☆179Updated 4 years ago
- Code for our paper: *Shamsian, *Kleinfeld, Globerson & Chechik, "Learning Object Permanence from Video"☆67Updated 9 months ago