WeNet in Action course homework
☆20Oct 7, 2022Updated 3 years ago
Alternatives and similar repositories for wenet_in_action_homework
Users that are interested in wenet_in_action_homework are comparing it to the libraries listed below
- Open Source Speech/Text Data on AI☆19Sep 13, 2022Updated 3 years ago
- Guide to training open-source speech recognition models on custom data☆13Oct 8, 2023Updated 2 years ago
- ☆13Mar 30, 2023Updated 2 years ago
- DAMO FSMN VAD C++ inference service☆18Apr 17, 2023Updated 2 years ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 3 months ago
- simple energy vad☆19Jun 3, 2017Updated 8 years ago
- Audio streaming transfer demo with google.api.HttpBody and grpc gateway for speech synthesis☆20Jan 28, 2020Updated 6 years ago
- streaming attention networks for end-to-end automatic speech recognition☆55May 6, 2020Updated 5 years ago
- video cut powered by AI☆24Nov 15, 2022Updated 3 years ago
- silero-vad PyTorch implementation☆36Nov 23, 2024Updated last year
- (WIP) long-form speech generation☆31Apr 2, 2025Updated 11 months ago
- Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems☆82Jan 25, 2026Updated last month
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction.☆134Sep 19, 2025Updated 5 months ago
- simple dnn based vad☆70Dec 2, 2018Updated 7 years ago
- Speech Emotion Recognition using Deep Learning☆12May 24, 2021Updated 4 years ago
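- SSM-based bookstore book sales management system 3, with two roles, administrator and user. Administrator: create, read, update, and delete books, book categories, and users; review orders; view order details. User: fuzzy book search, purchasing, shopping cart, checkout, registration and login.☆10Jan 11, 2024Updated 2 years ago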
- Tensorflow Implementation of WaveGlow☆37May 4, 2020Updated 5 years ago
- A python implementation of the Griffin Lim Algorithm for audio reconstruction from magnitudes☆34Jan 17, 2024Updated 2 years ago
- ☆37Jul 4, 2024Updated last year
- Modern, fast and ergonomic C++ HTTP/1.1, HTTP/2 and WebSocket server library for Linux, perfect for microservices.☆34Updated this week
- WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models☆27Feb 13, 2026Updated 3 weeks ago
- [WACV 2025] Cross-Task Affinity Learning for Multitask Dense Scene Predictions☆11Jun 12, 2025Updated 8 months ago
- Python wrapper for kaldi's arpa2fst☆38Aug 27, 2025Updated 6 months ago
- Lightweight model that detects 5 kinds of fruit; includes a self-made fruit dataset of 750 images.☆14Jul 17, 2022Updated 3 years ago
- Filter Banks, Fast Python Implementation☆42Jul 9, 2022Updated 3 years ago
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline☆196Dec 13, 2024Updated last year
- A UWP client for the ASRT speech recognition system.☆12Oct 23, 2019Updated 6 years ago
- Open-source Mandarin biased-word dataset☆14Sep 21, 2023Updated 2 years ago
- Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)☆15Oct 22, 2025Updated 4 months ago
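- YOLOv5 + SORT object detection and tracking implemented on the RK3588 (C++ version)☆12May 28, 2024Updated last year
- 🔥 Flutter rich text editor, feature-rich with an attractive UI, built on flutter-quill; supports @-mentions, hyperlinks, highlighting, and more☆12Jan 17, 2026Updated last month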
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- Pedestrian detection and tracking with YOLOv5 + DeepSORT, implemented on the RV1126☆15May 31, 2024Updated last year
- Colab notebooks for d2l-book☆11Dec 5, 2019Updated 6 years ago
- ☆10May 23, 2023Updated 2 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval☆13Jun 27, 2025Updated 8 months ago
- rkllm_talking is a standalone compiled voice communication system based on a large model☆13Oct 13, 2024Updated last year
- A lovely structopt library for C++! Parse command line arguments by defining a struct! ❤️☆11Apr 24, 2023Updated 2 years ago