LeeYongHyeok/DCM_vgg_transformer

Dual cross-modality attention audio-visual speech recognition model based on the VGG transformer, with a hybrid CTC/attention architecture, built with fairseq

☆14

Alternatives and similar repositories for DCM_vgg_transformer

Users who are interested in DCM_vgg_transformer are comparing it to the repositories listed below.

lightning830 / E2E-audio-speech-recognition
Conformer encoder + Transformer decoder with Hybrid CTC/attention
☆12 · Nov 11, 2021 · Updated 4 years ago
SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…
☆15 · Jul 27, 2023 · Updated 3 years ago
lzuwei / end-to-end-multiview-lipreading
End to End Multiview Lip Reading
☆10 · Jan 26, 2018 · Updated 8 years ago
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆22 · Apr 11, 2022 · Updated 4 years ago
eastonYi / end-to-end_asr_pytorch
Implementations of CTC, Speech-Transformer and CIF for end-to-end speech recognition with PyTorch
☆23 · Jul 28, 2020 · Updated 6 years ago
lzuwei / ip-avsr
Audio Visual Speech Recognition
☆23 · Aug 9, 2017 · Updated 8 years ago
georgesterpu / Taris
Transformer-based online speech recognition system with TensorFlow 2
☆26 · Jan 22, 2021 · Updated 5 years ago
georgesterpu / avsr-tf1
Audio-Visual Speech Recognition using Sequence to Sequence Models
☆84 · Jul 10, 2020 · Updated 6 years ago
mayank-git-hub / ETE-Speech-Recognition
Implementation of Hybrid CTC/Attention Architecture for End-to-End Speech Recognition in pure python and PyTorch
☆26 · Jul 25, 2024 · Updated 2 years ago
choijeongsoo / lip2speech-unit
[Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units
☆47 · Oct 26, 2024 · Updated last year
baoy-nlp / DSS-VAE-pytorch
Generating Sentences from Disentangled Syntactic and Semantic Spaces
☆11 · Jun 24, 2019 · Updated 7 years ago
prajwalkr / transpotter
Official implementation of Transpotter, published in BMVC 2021
☆16 · Aug 6, 2022 · Updated 3 years ago
Bucknalla / lopy-raspberrypi
🎮 Use a Raspberry Pi to control a LoPy over UART
☆12 · Mar 9, 2017 · Updated 9 years ago
diaoenmao / Speech-Emotion-Recognition-with-Dual-Sequence-LSTM-Architecture
[ICASSP 2020] Speech Emotion Recognition with Dual-Sequence LSTM Architecture
☆12 · Jan 17, 2025 · Updated last year
lemmonation / fcl-nat
Code for "Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation"
☆13 · Jul 10, 2020 · Updated 6 years ago
georgesterpu / pyVSR
Python toolkit for Visual Speech Recognition
☆37 · Jun 10, 2020 · Updated 6 years ago
bagustris / SER_ICSigSys2019
Repository of code for Speech emotion recognition using voiced speech and attention model, submitted to ICSigSys 2019
☆13 · Jan 6, 2020 · Updated 6 years ago
wuyinwuxian / Neural_Network_optimization_method
Matlab code comparing five common neural network optimization algorithms: SGD, SGDM, Adagrad, AdaDelta, and Adam
☆11 · Mar 23, 2022 · Updated 4 years ago
shleee47 / Sound-Source-Localization
Sound Source Localization for AI Grand Challenge 2021
☆21 · Feb 7, 2022 · Updated 4 years ago
jarret / raspi-uart-waveshare
A library for interfacing with the 4.3inch UART e-Paper from a Raspberry Pi 2/3 via Python3 with example programs to display QR Codes for…
☆12 · Mar 9, 2019 · Updated 7 years ago
TarekVito / ColorCoherenceVector
Color Coherence Vector is a powerful color-based image retrieval (Matlab)
☆11 · Feb 27, 2015 · Updated 11 years ago
foriamweak / FastAffineMotion
Official git for "Fast Affine Motion Estimation for Versatile Video Coding (VVC) Encoding"
☆11 · Sep 14, 2020 · Updated 5 years ago
BitFloyd / Shot_Segmentation
Project to segment video stream into separate shots
☆13 · Oct 30, 2018 · Updated 7 years ago
lshiwjx / deformable-3d-convnets
Deformable 3D ConvNets for Action Recognition
☆10 · Jan 21, 2018 · Updated 8 years ago
shleee47 / mpWAV-Sound-Source-Localization
Sound Source Localization for AI Grand Challenge 2021
☆22 · Feb 8, 2022 · Updated 4 years ago
saschaschramm / MonteCarloTreeSearch
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
☆10 · May 30, 2018 · Updated 8 years ago
artem179 / WLAS
The implementation of 'Watch, Listen, Attend and Spell' (WLAS) network that learns to transcribe videos of mouth motion to character on p…
☆11 · Mar 23, 2018 · Updated 8 years ago
Joyce94 / BiLSTM-CRF-pytorch
☆11 · Aug 8, 2018 · Updated 7 years ago
jakopy / Project-Best-Cafe-Arena-Simulation-Software-
☆11 · Apr 3, 2017 · Updated 9 years ago
rs-dl / TSAN
A Two Stage Adaptation Network (TSAN) for remote sensing images classification under single-source-mixed-multiple-target domain adaptatio…
☆16 · Jan 11, 2023 · Updated 3 years ago
pangjh3 / AnLLM
☆20 · Jun 17, 2024 · Updated 2 years ago
Li-Sanze / ID-Card
Given the front and back of an ID card, recognizes all text information on the card
☆10 · Sep 4, 2019 · Updated 6 years ago
mdangschat / speech-corpus-dl
Download and preparation tool for free speech corpora.
☆16 · Apr 28, 2019 · Updated 7 years ago
initc / bert-fairseq
Implement BERT and MulitPointer-generator on the basis of fairseq
☆13 · Oct 6, 2022 · Updated 3 years ago
tsiangleo / TensorFlowMnist
☆15 · Apr 27, 2017 · Updated 9 years ago
zhengzx-nlp / MGNMT
☆15 · Oct 19, 2021 · Updated 4 years ago
PietroAvolio / camera-motion-estimation
Implementation of a Preemptive RANSAC algorithm for Camera Motion Estimation
☆15 · Jun 16, 2017 · Updated 9 years ago
singaln / GPT2-chinese-chatbot
A chit-chat model implemented with GPT2
☆12 · Apr 22, 2021 · Updated 5 years ago
bjxia1 / wechat_official_account_picture_downloader
Image downloader for WeChat official accounts
☆14 · Jun 30, 2019 · Updated 7 years ago