lzuwei/ip-avsr

lzuwei / ip-avsr

Audio Visual Speech Recognition

☆23

Alternatives and similar repositories for ip-avsr

Users who are interested in ip-avsr are comparing it to the libraries listed below.

lzuwei / end-to-end-multiview-lipreading
End to End Multiview Lip Reading
☆10 · Jan 26, 2018 · Updated 8 years ago
ajinkyaT / Lip_Reading_in_the_Wild_AVSR
Audio-Visual Speech Recognition using Deep Learning
☆61 · Nov 14, 2018 · Updated 7 years ago
pandeydivesh15 / AVSR-Deep-Speech
Google Summer of Code 2017 Project: Development of Speech Recognition Module for Red Hen Lab
☆44 · Aug 29, 2017 · Updated 8 years ago
LeeYongHyeok / DCM_vgg_transformer
Dual cross modality attention audio-visual speech recognition model based on vgg transformer with hybrid CTC/attention architecture using…
☆14 · Jul 2, 2020 · Updated 6 years ago
lelechen63 / 3d_gan
☆34 · Jul 25, 2018 · Updated 7 years ago
afperezm / acoustic-images-distillation
Code for the paper: Audio-Visual Model Distillation Using Acoustic Images
☆21 · Mar 24, 2023 · Updated 3 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆37 · Jun 10, 2020 · Updated 6 years ago
georgesterpu / avsr-tf1
Audio-Visual Speech Recognition using Sequence to Sequence Models
☆84 · Jul 10, 2020 · Updated 6 years ago
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
hassanhub / LipReading
☆64 · Oct 8, 2018 · Updated 7 years ago
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆22 · Apr 11, 2022 · Updated 4 years ago
idnavid / py_vad_tool
Python script for voice activity detection.
☆36 · Aug 16, 2024 · Updated last year
dolphin-li / DeformationTransferSameTopology
☆12 · May 20, 2020 · Updated 6 years ago
renjiec / GLID
GPU-Accelerated Locally Injective Shape Deformation
☆13 · Sep 26, 2017 · Updated 8 years ago
rs-dl / TSAN
A Two Stage Adaptation Network (TSAN) for remote sensing images classification under single-source-mixed-multiple-target domain adaptatio…
☆16 · Jan 11, 2023 · Updated 3 years ago
Li-Sanze / ID-Card
Given the front and back of an ID card, recognizes all of the text information on the card
☆10 · Sep 4, 2019 · Updated 6 years ago
seanexp / LipMovement
Detects lip movement and checks whether a person is speaking
☆19 · May 4, 2018 · Updated 8 years ago
weedwind / CTC-speech-recognition
This is a working example of using CTC for phone recognition on TIMIT
☆50 · Oct 19, 2017 · Updated 8 years ago
JaesungBae / Speech-Command-Recognition-with-Capsule-Network
Speech command recognition with capsule network & various NNs / KWS on Google Speech Command Dataset.
☆25 · Jan 28, 2019 · Updated 7 years ago
pmiller10 / frankenstein
Machine Learning Framework
☆10 · Mar 17, 2016 · Updated 10 years ago
dimtzionas / HandObjectInteractionIJCV16_HandMotionViewer
Hand MoCap 3d viewer for the IJCV'16 paper "Capturing Hands in Action using Discriminative Salient Points and Physics Simulation"
☆11 · May 19, 2016 · Updated 10 years ago
baopingli / scoreboard
Scoreboard algorithm lab for an advanced computer architecture course
☆13 · Dec 22, 2018 · Updated 7 years ago
techmatt / actsynth
Activity-centric Scene Synthesis for Functional 3D Scene Modeling
☆16 · Sep 6, 2015 · Updated 10 years ago
MontrealCorpusTools / speechcorpustools
Easier analysis of large speech corpora
☆24 · Jun 22, 2021 · Updated 5 years ago
diegothomas / FaceCap
This is the source code for the arXiv paper: https://arxiv.org/pdf/2004.10557.pdf
☆14 · Jul 16, 2021 · Updated 5 years ago
davidjonas / MoCap
Exploration of motion capture using an OptiTrack NatNet system
☆14 · Apr 20, 2021 · Updated 5 years ago
engyasin / EKF-MonoSLAM_for_3D-reconstruction
Using MonoSLAM as a starting step for 3D reconstruction
☆11 · Aug 23, 2020 · Updated 5 years ago
lilianemomeni / KWS-Net
Seeing Wake Words: Audio-visual Keyword Spotting
☆67 · Sep 16, 2020 · Updated 5 years ago
SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…
☆15 · Jul 27, 2023 · Updated 2 years ago
luanshiyinyang / ChineseOCR
End-to-end Chinese scene text recognition.
☆12 · Jun 27, 2022 · Updated 4 years ago
Lenvia / RBM-BP-character-recognition
Recognizing handwritten digits and English characters with an RBM + BP neural network
☆11 · Mar 25, 2023 · Updated 3 years ago
CHoudrouge4 / SNNAHDD
Scalable Nearest Neighbor Algorithms for High Dimensional Data
☆10 · May 10, 2019 · Updated 7 years ago
jinsongpan / ASR_Course_Homework
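Homework assignments completed during the first session of the 深蓝学院 (Shenlan Xueyuan) course "Speech Recognition: From Beginner to Mastery", shared for reference.
☆21 · Sep 13, 2020 · Updated 5 years ago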
ichn-hu / DSP-Audio-Collector
Web app created to collect audio recordings for a course project
☆10 · Apr 6, 2018 · Updated 8 years ago
chinedufn / mat4-to-dual-quat
Convert a 4x4 matrix into a dual quaternion. Useful for skeletal animation (dual quaternion linear blending)
☆15 · Jul 4, 2017 · Updated 9 years ago
Zhong-master / PocketSphinx_Speech_Recognition
PocketSphinx_Speech_Recognition
☆10 · Aug 5, 2021 · Updated 4 years ago
tstafylakis / Lipreading-ResNet
Torch code for using Residual Networks with LSTMs for Lipreading
☆99 · Oct 8, 2018 · Updated 7 years ago
TimeChi / Lip_Reading_Competition
Code shared from the AI algorithm track of the 2019 "创青春.交子杯" (Chuang Qingchun Jiaozi Cup) XW Bank University FinTech Challenge
☆89 · Jul 15, 2020 · Updated 6 years ago
Mormukut11 / 3D-Machine-Learning-complete-Resources
☆15 · Jun 15, 2018 · Updated 8 years ago