Shanghainese (Shanghai dialect) ASR (automatic speech recognition) model
☆28May 13, 2024Updated last year
Alternatives and similar repositories for asr
Users that are interested in asr are comparing it to the libraries listed below
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- A Chinese singing voice dataset, professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- Bilibili video audio-track downloader (supports multi-part videos). Rebuilt from https://github.com/Quandong-Zhang/bilibiliAudioDownloader.ps1 in Python☆11Jul 31, 2025Updated 7 months ago
- ☆13Jan 11, 2026Updated last month
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)☆20Mar 17, 2025Updated 11 months ago
- Sequence alignment methods with helpers for PyTorch.☆24Nov 30, 2022Updated 3 years ago
- A piano music dataset with Audio, Symbolic and Text labels☆34Mar 6, 2025Updated 11 months ago
- An open-source Kazakh Emotional Text-to-Speech Dataset☆35Aug 1, 2025Updated 7 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ☆14Sep 20, 2025Updated 5 months ago
- Official PyTorch implementation of the paper "Robust Training for Speaker Verification against Noisy Labels" in INTERSPEECH 2023.☆11Oct 23, 2023Updated 2 years ago
- A tool for aligning phoneme labels to Japanese speech audio☆36Aug 19, 2025Updated 6 months ago
- ViSpeR: Multilingual Audio-Visual Speech Recognition☆56Apr 17, 2025Updated 10 months ago
- Code repository for ‘Adaptive Differential Denoising for Respiratory Sounds Classification’☆21Dec 19, 2025Updated 2 months ago
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.☆12Nov 7, 2024Updated last year
- Resources for "Simple Speech Representation Learning from Perceptual Data".☆11Sep 18, 2023Updated 2 years ago
- ☆11Aug 11, 2023Updated 2 years ago
- Fine-Grained Knowledge Fusion for Retrieval-Augmented Medical Visual Question☆11Jul 18, 2024Updated last year
- ☆16Jun 12, 2025Updated 8 months ago
- Xiaoyuzhou FM audio downloader.☆43Nov 20, 2025Updated 3 months ago
- Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment☆38Sep 13, 2023Updated 2 years ago
- ☆10Oct 14, 2020Updated 5 years ago
- JSGF Deducer based on JSGF grammar and WFST☆11Jan 11, 2018Updated 8 years ago
- text to speech☆10Mar 19, 2024Updated last year
- 美丽东 natural language processing toolbox: named entity recognition, text classification, language models, text summarization.☆10Nov 28, 2022Updated 3 years ago
- Public female English corpus used for Project AI❤dol☆14Dec 28, 2025Updated 2 months ago
- ☆11Jan 12, 2023Updated 3 years ago
- Phonemes and durations labeling based on whisper small☆11Jul 7, 2024Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- A collection of natural language processing resources☆10May 8, 2020Updated 5 years ago
- RIFE with IFUNet, FusionNet and RefineNet☆12Jun 30, 2022Updated 3 years ago
- Unsupervised Cross-lingual Sentiment Analysis (CoNLL 2019)☆10Nov 4, 2019Updated 6 years ago
- ☆11Nov 7, 2024Updated last year
- CodeReadingNote Pro supports JetBrains 22.1.4+, code remarks, custom tags, grouping tags by topic, ongoing maintenance☆12Updated this week
- HWFI: Hybrid Warping Fusion for Video Frame Interpolation. IJCV 2022☆11Sep 7, 2022Updated 3 years ago
- SimplifiedTransformer simplifies transformer block without affecting training. Skip connections, projection parameters, sequential sub-bl…☆15Feb 6, 2026Updated 3 weeks ago
- VIPNet: Visual Interaction Perceptual Network for Blind Image Quality Assessment☆11Dec 18, 2024Updated last year
- ☆13Nov 22, 2022Updated 3 years ago
- This is our Final Year Project titled "Implementation of seam carving for image retargeting using CUDA enabled GPU"☆11Nov 16, 2024Updated last year