Alittleegg/Eureka-Audio

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Alittleegg/Eureka-Audio)

Alittleegg / Eureka-Audio

Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasoning.

☆40

Alternatives and similar repositories for Eureka-Audio

Users that are interested in Eureka-Audio are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ASLP-lab / OSUM-Pangu
View on GitHub
An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs
☆33Mar 15, 2026Updated 4 months ago
colaudiolab / AudioSet-R
View on GitHub
Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"
☆19Oct 9, 2025Updated 9 months ago
pengzhendong / torchfa
View on GitHub
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61Sep 5, 2025Updated 10 months ago
roudimit / Omni-R1
View on GitHub
[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
☆47Nov 21, 2025Updated 8 months ago
pengzhendong / ngram-punctuator
View on GitHub
An N-gram punctuator for Chinese and English.
☆18Oct 14, 2025Updated 9 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
xiaomi-research / dasheng-tokenizer
View on GitHub
State-of-the-art continious audio tokenization
☆40Mar 9, 2026Updated 4 months ago
ASLP-lab / Hum-Dial
View on GitHub
ICASSP2026 HumDial Challenge
☆50May 28, 2026Updated last month
xiquan-li / Resonate
View on GitHub
[INTERSPEECH 2026] Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation
☆48Apr 17, 2026Updated 3 months ago
pengzhendong / wavesurfer
View on GitHub
For audio visualization and playback in Jupyter notebooks.
☆18Nov 25, 2025Updated 7 months ago
yfyeung / DS-WED
View on GitHub
[ICASSP 2026] Official code for "Measuring Prosody Diversity in Zero-Shot TTS: A New Metric, Benchmark, and Exploration"
☆17Apr 16, 2026Updated 3 months ago
IiuZiKai / Evo_TSE
View on GitHub
☆17Apr 9, 2026Updated 3 months ago
BUTSpeechFIT / SOT-DiCoW
View on GitHub
Multi-talker ASR based on DiCoW with Serialized Output Training
☆20Sep 18, 2025Updated 10 months ago
ryuclc / CosyVoice2-GRPO
View on GitHub
A simple implementation for improving CosyVoice2 by GRPO method
☆39May 5, 2026Updated 2 months ago
ASLP-lab / Easy-Turn
View on GitHub
Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems
☆121Jan 25, 2026Updated 5 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
leospark / FireRedVAD-Engineering
View on GitHub
Lightweight streaming Voice Activity Detection (VAD) tool with ONNX runtime
☆22Mar 18, 2026Updated 4 months ago
yanghaha0908 / WavCube
View on GitHub
Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"
☆62Jun 27, 2026Updated 3 weeks ago
yangdongchao / Omni-AutoThink
View on GitHub
Adaptive Multimodal Reasoning via Reinforcement Learning
☆23Jan 11, 2026Updated 6 months ago
Mddct / cosyvoice2-flow-optimized
View on GitHub
faster inference
☆27Jan 20, 2025Updated last year
JHU-LCAP / FlexSED
View on GitHub
open-vocabulary sound event detection
☆53Dec 17, 2025Updated 7 months ago
Audio-Reasoning-Challenge / Audio-Reasoning-Challenge-Baselines
View on GitHub
The baselines of ARC-Challenge-Interspeech2026
☆60Dec 1, 2025Updated 7 months ago
EIT-NLP / LLaSO
View on GitHub
☆116Oct 21, 2025Updated 9 months ago
KeiKinn / ParaCLAP
View on GitHub
Towards a general language-audio model for computational paralinguistic tasks
☆30Dec 14, 2024Updated last year
Mddct / usm-tokenizer
View on GitHub
semantic tokenizer for speech and music
☆20Jul 6, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
yfyeung / CLSP
View on GitHub
[ACL 2026 Main] Open-Ended Speaking Style Modeling via Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
☆104Apr 6, 2026Updated 3 months ago
b-sigpro / sed-hsmm
View on GitHub
Onset-and-Offset-Aware Sound Event Detection
☆21Feb 10, 2025Updated last year
MaikeZuefle / f-actor
View on GitHub
☆28Updated this week
JaesungHuh / ca-subtitle
View on GitHub
Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"
☆21Nov 3, 2025Updated 8 months ago
LAION-AI / emotion-annotations
View on GitHub
☆110Updated this week
llm-jp / llama-mimi
View on GitHub
Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…
☆31Sep 20, 2025Updated 10 months ago
pengzhendong / audiolab
View on GitHub
A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)
☆39Mar 31, 2026Updated 3 months ago
xzf-thu / Voices-in-the-Wild-Bench
View on GitHub
☆28May 22, 2026Updated last month
MRSAudio / MRSAudio_Main
View on GitHub
MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations
☆43Oct 15, 2025Updated 9 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Soul-AILab / SoulX-Duplug
View on GitHub
Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.
☆273Updated this week
HITsz-TMG / Lychee-FD
View on GitHub
Native End-to-End Full-Duplex Spoken Language Model
☆60Updated this week
DanielLin94144 / Full-Duplex-Bench
View on GitHub
A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models
☆238May 20, 2026Updated 2 months ago
wenet-e2e / west
View on GitHub
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
☆206Updated this week
inclusionAI / MingTok-Audio
View on GitHub
☆88Feb 24, 2026Updated 4 months ago
Ruiqi-Yan / URO-Bench
View on GitHub
Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
☆55Sep 2, 2025Updated 10 months ago
R1ckShi / FrontEnd-AEC
View on GitHub
Acoustic echo cancelation(AEC) is a main algorithm in the pipe line of acoustic devices with KWS or ASR. FNLMS is used.
☆19Apr 22, 2019Updated 7 years ago