wenet-e2e/wesubtitle

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/wenet-e2e/wesubtitle)

wenet-e2e / wesubtitle

用 OCR 提取视频硬字幕

☆90

Alternatives and similar repositories for wesubtitle

Users that are interested in wesubtitle are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

pengzhendong / ngram-punctuator
View on GitHub
An N-gram punctuator for Chinese and English.
☆18Oct 14, 2025Updated 9 months ago
k2-fsa / kaldi-decoder
View on GitHub
Decoders from Kaldi using OpenFst
☆35Apr 10, 2026Updated 3 months ago
Slyne / ctc_decoder
View on GitHub
A ctc decoder for both online and offline asr model
☆66Nov 18, 2023Updated 2 years ago
wenet-e2e / wesignal
View on GitHub
Production first, nn-based on-device signal processing toolkit.
☆63May 30, 2023Updated 3 years ago
kamilakesbi / DiarizersLM
View on GitHub
☆15Jul 16, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
backspacetg / distilXLSR
View on GitHub
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13Mar 30, 2025Updated last year
L6-NLP / Generative-Annotation-NEC
View on GitHub
Generative_Annotation_NEC: A novel NEC method that utilizes speech sound features to retrieve candidate entities and a generative method …
☆17Dec 2, 2025Updated 7 months ago
pigzach / MagicSpeechASR
View on GitHub
magicspeech competition recipe
☆18Jun 29, 2020Updated 6 years ago
yucongzh / online_speaker_diarization
View on GitHub
☆15Jul 11, 2022Updated 4 years ago
acids-ircam / Expressive_WAE_FADER
View on GitHub
companion repository to the DAFx-19 paper "Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders" by Ad…
☆11Jun 22, 2019Updated 7 years ago
facebookresearch / MMCSG
View on GitHub
This repository contains the baseline system for CHiME-8 MMCSG challenge focusing on transcribing both sides of a conversation where one …
☆41Mar 13, 2024Updated 2 years ago
wenet-e2e / wesr
View on GitHub
We Speech Transcript based on LLM, in 300 lines of code.
☆182Jun 20, 2025Updated last year
XinyuZhou2000 / Spoken-Dialogue
View on GitHub
☆18Dec 7, 2023Updated 2 years ago
pengzhendong / audiolab
View on GitHub
A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)
☆39Mar 31, 2026Updated 3 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
mispchallenge / misp2021_baseline
View on GitHub
☆29Jun 15, 2022Updated 4 years ago
pengzhendong / torchfa
View on GitHub
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61Sep 5, 2025Updated 10 months ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Updated this week
lourson1091 / audiobertscore
View on GitHub
☆15Nov 10, 2025Updated 8 months ago
levtelyatnikov / radiomixer
View on GitHub
radiomixer
☆14Feb 16, 2022Updated 4 years ago
xing96 / MIM-lipreading
View on GitHub
Code and model for paper <Mutual Information Maximization for Effective Lip Reading>
☆19Sep 4, 2020Updated 5 years ago
pengzhendong / g2p-mix
View on GitHub
Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.
☆115Updated this week
ms-dot-k / LRW_ID
View on GitHub
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10Oct 12, 2023Updated 2 years ago
tzyll / ChineseHP
View on GitHub
Dataset for Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models in Interspeech 2024.
☆16Jul 4, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
liuhuang31 / g2pw_once
View on GitHub
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14Dec 30, 2023Updated 2 years ago
MaxMax2016 / Grad-TTS-Chinese
View on GitHub
Huawei Grad-TTS for Chinese
☆50Sep 26, 2023Updated 2 years ago
csukuangfj / kaldilm
View on GitHub
Python wrapper for kaldi's arpa2fst
☆38Aug 27, 2025Updated 11 months ago
wonjune-kang / expressive-speech-retrieval
View on GitHub
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15Aug 18, 2025Updated 11 months ago
mozillazg / pypinyin-g2pW
View on GitHub
基于 g2pW 提升 pypinyin 的准确性
☆105Jun 24, 2023Updated 3 years ago
wenet-e2e / wetts
View on GitHub
Production First and Production Ready End-to-End Text-to-Speech Toolkit
☆416Nov 20, 2025Updated 8 months ago
fanlu / wenet
View on GitHub
Transformer based ASR Engine.
☆13Aug 23, 2021Updated 4 years ago
MagicHub-io / MagicData-RAMC
View on GitHub
MagicData-RAMC Dataset and Baseline
☆64Sep 13, 2022Updated 3 years ago
xingchensong / TouchNet
View on GitHub
A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp.
☆233Jul 2, 2026Updated 3 weeks ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
whzikaros / g2pL
View on GitHub
The implementation of g2pL with a new open dataset.
☆16May 14, 2023Updated 3 years ago
pengzhendong / wetext
View on GitHub
Python runtime for WeTextProcessing (does not depend on Pynini)
☆53Jun 11, 2026Updated last month
corticph / error-align
View on GitHub
Text-to-text alignment algorithm for speech recognition error analysis.
☆32Jun 23, 2026Updated last month
MrSupW / ContextASR-Bench
View on GitHub
A Massive Contextual Speech Recognition Benchmark.
☆107Aug 6, 2025Updated 11 months ago
k2-fsa / text_search
View on GitHub
Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup
☆79Jun 30, 2025Updated last year
LAION-AI / emotional-speech-annotations
View on GitHub
This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models
☆35Oct 13, 2024Updated last year
lovemefan / telespeech-asr-python
View on GitHub
☆68Jul 17, 2024Updated 2 years ago