Sid2697 / Word-recognition-and-retrieval
Code implementation for our DAS, 2020 paper titled "Fused Text Recogniser and Deep Embeddings Improve Word Recognition and Retrieval"
☆15Updated 6 months ago
Alternatives and similar repositories for Word-recognition-and-retrieval:
Users who are interested in Word-recognition-and-retrieval are comparing it to the libraries listed below.
- Code implementation for our ICPR, 2020 paper titled "Improving Word Recognition using Multiple Hypotheses and Deep Embeddings"☆21Updated 3 years ago
- Official code for the paper "Visual Speech Enhancement Without A Real Visual Stream" published at WACV 2021☆104Updated 8 months ago
- Code for "Weakly-supervised Fingerspelling Recognition in British Sign Language Videos", BMVC 2022.☆11Updated last year
- Built a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline.☆48Updated 6 years ago
- A fully convolutional network for speech-to-text, built on PyTorch.☆126Updated 4 years ago
- Labeled Movie Trailer Dataset☆16Updated 6 years ago
- ☆24Updated 6 years ago
- menovideo: pytorch library for video action recognition and video understanding☆28Updated 3 years ago
- Pytorch Code for S2IGAN☆41Updated 4 years ago
- Download specific objects from Open-Images Dataset☆37Updated 6 years ago
- Code for the cocktail-party problem of isolating and enhancing the speech for the target speaker☆17Updated 2 years ago
- PyTorch Implementation of "Facial Image-to-Video Translation by a Hidden Affine Transformation" in MM'19.☆55Updated 5 years ago
- A script to help you quickly build custom computer vision datasets☆34Updated 4 years ago
- This sample includes a simple CNN classifier for music and an audio-folder dataloader, just like ImageFolder in torchvision.☆11Updated 6 years ago
- "LipNet: End-to-End Sentence-level Lipreading" in PyTorch☆67Updated 5 years ago
- A deep learning model utilizing CNN and LSTM to recognize activity from video. (To be used for bench-marking hardware accelerator)☆8Updated 5 years ago
- Object Detection for wheat heads in the given images☆23Updated 4 years ago
- SpeechYOLO Interspeech 2019☆42Updated 2 years ago
- 5th Place Solution to 3rd YouTube-8M Video Understanding Challenge (Last Top GB Model)☆13Updated 5 years ago
- Representation for Handwritten Word Images☆24Updated 4 years ago
- Code and models for evaluating a state-of-the-art lip reading network☆193Updated last year
- Model submitted for the ICMI 2018 EmotiW Group-Level Emotion Recognition Challenge☆79Updated 6 years ago
- Unofficial Implementation of Google Deepmind's paper `Objects that Sound`☆83Updated 6 years ago
- Automated Lip Reading using Deep Reinforcement Learning☆30Updated 6 years ago
- Datasets, transforms and samplers for video in PyTorch☆87Updated last year
- Minimal implementation of Denoised Smoothing (https://arxiv.org/abs/2003.01908) in TensorFlow.☆20Updated 3 years ago
- Urban sound source tagging from an aggregation of four second noisy audio clips via 1D and 2D CNN (Xception)☆58Updated last year
- A non-JIT version implementation / replication of CLIP of OpenAI in pytorch☆34Updated 4 years ago
- ☆64Updated 6 years ago
- Best Collection of Articles and code for Audio Classification☆14Updated 5 years ago