chenpk00/IS2024_stream_decoder_only_asr

☆16

Alternatives and similar repositories for IS2024_stream_decoder_only_asr

Users who are interested in IS2024_stream_decoder_only_asr are comparing it to the repositories listed below.

gengxuelong / wenet_LLM_from_ASLP
wenet_LLM_from_ASLP
☆15 · Nov 26, 2024 · Updated last year
wonjune-kang / llm-speech-summarization
Prompting Large Language Models with Audio for General-Purpose Speech Summarization
☆20 · May 14, 2025 · Updated last year
tomasJwYU / AutoPrepDemo
AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
☆36 · Dec 31, 2023 · Updated 2 years ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
SpeechColab / GigaSpeechBench
☆29 · Updated this week
Zildj1an / FIFO-Driver
Character device driver working as FIFO pipe, created with a Linux Kernel module (SMP-Safe). Works on Android's kernel too.
☆13 · Jun 14, 2021 · Updated 5 years ago
ASLP-lab / FMSU-Bench
Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model
☆25 · May 21, 2026 · Updated 2 months ago
VITA-MLLM / Freeze-Omni
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
☆388 · May 27, 2025 · Updated last year
ASLP-lab / Easy-Turn
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
☆122 · Jan 25, 2026 · Updated 6 months ago
Mddct / cosyvoice2-flow-optimized
faster inference
☆27 · Jan 20, 2025 · Updated last year
bfs18 / rfwave
☆152 · Apr 25, 2025 · Updated last year
y-ren16 / TiCodec
☆81 · Aug 11, 2025 · Updated 11 months ago
ASLP-lab / Smart-Glass-Challenge
☆17 · Jun 16, 2026 · Updated last month
xiaomi-research / dasheng-tokenizer
State-of-the-art continuous audio tokenization
☆40 · Mar 9, 2026 · Updated 4 months ago
t13m / kaldi-readers-for-tensorflow
Readers that enable reading Kaldi ark files in TensorFlow
☆17 · Mar 7, 2018 · Updated 8 years ago
hyama5 / vae_align
Alignment examples for Interspeech 2024
☆28 · Jul 5, 2024 · Updated 2 years ago
ASLP-lab / Automatic-Song-Aesthetics-Evaluation-Challenge
☆15 · Dec 14, 2025 · Updated 7 months ago
ASLP-lab / C2SER
We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…
☆17 · Mar 3, 2025 · Updated last year
vicgalle / refined-dpo
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
☆13 · Feb 13, 2024 · Updated 2 years ago
lars76 / forced-alignment-chinese
Mandarin Chinese audio datasets aligned with Montreal Forced Aligner
☆19 · Aug 13, 2024 · Updated last year
pengzhendong / torchfa
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61 · Sep 5, 2025 · Updated 10 months ago
xiaoxing2001 / DeGLA
[ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]
☆16 · Jul 15, 2025 · Updated last year
ASLP-lab / MSU-Bench
Open repository of "MSU-Bench: Towards Understanding the Conversational Multi-Speaker Scenarios"
☆18 · Jul 7, 2026 · Updated 2 weeks ago
ASLP-lab / SmartGlasses
This challenge focuses on evaluating speech recognition and semantic understanding capabilities of AI glasses in complex real-world envir…
☆18 · Jun 27, 2026 · Updated 3 weeks ago
xinyebei / 2026_finvcup_baseline
Baseline for the 2026 信也杯 (FinVCup) competition
☆15 · Jun 17, 2026 · Updated last month
pengzhendong / welm
One command to build TLG.fst for WeNet.
☆30 · Oct 11, 2022 · Updated 3 years ago
anxiangsir / Video_Benchmark_Suite
Video Benchmark Suite: Rapid Evaluation of Video Foundation Models
☆17 · Jan 10, 2025 · Updated last year
ddlBoJack / Awesome-Speech-Language-Model
Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.
☆201 · Jun 7, 2026 · Updated last month
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
ASLP-lab / LLaSE-G1
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
☆47 · Mar 10, 2025 · Updated last year
tairenpiao / XNOR-popcount-GEMM-PyTorch-CPU-CUDA
A PyTorch implementation of real XNOR-popcount (1-bit op) GEMM: a Linear PyTorch extension supporting both CPU and CUDA
☆25 · Jun 6, 2023 · Updated 3 years ago
jishengpeng / Nucleic-acid-detection-system
Jilin University software engineering course project for Software Components and Middleware
☆15 · Aug 26, 2022 · Updated 3 years ago
kyegomez / USM
Implementation of Google's USM speech model in Pytorch
☆36 · Updated this week
ASLP-lab / LLaSA_Plus
Llasa Speed Up
☆64 · Jan 18, 2026 · Updated 6 months ago
gary083 / GAN_Harmonized_with_HMMs
Code: Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Mod…
☆25 · Dec 17, 2019 · Updated 6 years ago
jishengpeng / Design-compiler
Jilin University Compiler Principles course project: lexical analysis and syntax analysis programs for the SNL language.
☆16 · Jun 20, 2022 · Updated 4 years ago
yunyikristy / skipNet
☆12 · Oct 21, 2019 · Updated 6 years ago
lmxue / ICASSP2022_TTS_VC_Summary
ICASSP2022 TTS&VC Summary
☆13 · Jun 9, 2022 · Updated 4 years ago
George0828Zhang / simulst
PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.
☆25 · Oct 3, 2022 · Updated 3 years ago