FreedomIntelligence/S2S-Arena

FreedomIntelligence / S2S-Arena

☆21

Alternatives and similar repositories for S2S-Arena

Users who are interested in S2S-Arena are comparing it to the libraries listed below.

FreedomIntelligence / MTalk-Bench
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
☆20 · Nov 19, 2025 · Updated 8 months ago
npujcong / Chinese_PSP
Chinese Prosodic Structure Prediction
☆10 · May 18, 2019 · Updated 7 years ago
ASLP-lab / MSU-Bench
Open repository of "MSU-Bench: Towards Understanding the Conversational Multi-Speaker Scenarios"
☆17 · Jul 7, 2026 · Updated 2 weeks ago
Anonymous1252022 / Megatron-DeepSpeed
☆18 · Sep 22, 2024 · Updated last year
OmniMMI / OpenOmniNexus
a fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.
☆38 · Apr 7, 2025 · Updated last year
hmohebbi / disentangling_representations
☆14 · Oct 3, 2025 · Updated 9 months ago
FreedomIntelligence / ExpressiveSpeech
☆17 · Jun 10, 2026 · Updated last month
FreedomIntelligence / Soundwave
The official Soundwave repository
☆223 · Mar 16, 2025 · Updated last year
thuhcsi / SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37 · Apr 25, 2023 · Updated 3 years ago
ASLP-lab / FastTurn
☆33 · May 19, 2026 · Updated 2 months ago
khfs / DuplexMamba
☆18 · Mar 6, 2026 · Updated 4 months ago
zxzhao0 / C2SER
We propose C2SER, a novel audio-language model designed to enhance the stability and accuracy of speech emotion recognition through conte…
☆49 · Mar 3, 2025 · Updated last year
ASLP-lab / OSUM-Pangu
An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs
☆33 · Mar 15, 2026 · Updated 4 months ago
audiosae / audio-sae
Demo for AudioSAE paper
☆15 · Apr 26, 2026 · Updated 2 months ago
fgnt / speaker_reassignment
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14 · Feb 5, 2025 · Updated last year
XiangLi2022 / CM-TTS
[Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a…
☆68 · Mar 31, 2024 · Updated 2 years ago
IiuZiKai / Evo_TSE
☆17 · Apr 9, 2026 · Updated 3 months ago
FreedomIntelligence / GPT-API-Accelerate
The "GPT-API-Accelerate" project provides a set of Python classes for accelerating the process of generating responses to prompts using t…
☆23 · Oct 12, 2024 · Updated last year
FreedomIntelligence / PrinciplismQA
This repository includes the full release of PrinciplismQA dataset and assessment scripts.
☆16 · Updated this week
isjwdu / DFADD
Official Implementation and Dataset of paper - DFADD: The Diffusion and Flow-matching based Audio Deepfake Dataset
☆16 · Apr 7, 2025 · Updated last year
GLJS / AudioToolAgent
GitHub repository for AudioToolAgent
☆20 · Feb 13, 2026 · Updated 5 months ago
kaihuhuang / Language-Group
☆11 · Dec 24, 2024 · Updated last year
lavendery / UUG
☆21 · Sep 14, 2025 · Updated 10 months ago
DCGM / SoftCTC
This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135
☆19 · Mar 7, 2023 · Updated 3 years ago
b-sigpro / sed-hsmm
Onset-and-Offset-Aware Sound Event Detection
☆21 · Feb 10, 2025 · Updated last year
wangers / subtools2
egrecho project
☆11 · Apr 30, 2026 · Updated 2 months ago
BUTSpeechFIT / SOT-DiCoW
Multi-talker ASR based on DiCoW with Serialized Output Training
☆20 · Sep 18, 2025 · Updated 10 months ago
yanghaha0908 / WavCube
Official code for "WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling"
☆62 · Jun 27, 2026 · Updated 3 weeks ago
vTAD2025-Challenge / vTAD
☆15 · Oct 24, 2025 · Updated 8 months ago
justinlovelace / SESD
☆61 · Oct 28, 2024 · Updated last year
v-nhandt21 / ViMFA
Montreal Forced Aligner for Vietnamese
☆15 · Oct 23, 2023 · Updated 2 years ago
Honee-W / U-SAM
Official repository for U-SAM (Interspeech 2025)
☆28 · Jun 3, 2025 · Updated last year
yangdongchao / Omni-AutoThink
Adaptive Multimodal Reasoning via Reinforcement Learning
☆23 · Jan 11, 2026 · Updated 6 months ago
Ruiqi-Yan / URO-Bench
Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
☆55 · Sep 2, 2025 · Updated 10 months ago
Yuanshi9815 / LiteFocus
[Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA model, now implemented with the base model AudioLDM2.
☆34 · Mar 11, 2025 · Updated last year
Ashigarg123 / ShiftySpeech
☆15 · Jul 24, 2025 · Updated 11 months ago
daanzu / wenet_stt_python
☆33 · Nov 27, 2021 · Updated 4 years ago
xiaomi-research / dasheng-audiogen
end-to-end text to audio scene generation model
☆50 · Jun 16, 2026 · Updated last month
SJTU-OmniAgent / VocalNet
☆123 · May 18, 2026 · Updated 2 months ago