DanielLin94144/DUAL-textless-SQA

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/DanielLin94144/DUAL-textless-SQA)

DanielLin94144 / DUAL-textless-SQA

Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless Spoken Question Answering with Speech Discrete Unit Adaptive Learning" paper.

☆35

Alternatives and similar repositories for DUAL-textless-SQA

Users that are interested in DUAL-textless-SQA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

nervjack2 / Speech2Unit
View on GitHub
☆13Sep 25, 2024Updated last year
Splend1d / T5lephone
View on GitHub
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19Nov 29, 2022Updated 3 years ago
ckyang1124 / SAKURA
View on GitHub
Official GitHub repository for paper "SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Informa…
☆25Aug 14, 2025Updated 11 months ago
ga642381 / SpeechPrompt
View on GitHub
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…
☆102Apr 10, 2025Updated last year
jeffeuxMartin / meta-learning-hlp
View on GitHub
A publishing website of a table collecting meta-learning-related papers in the area of human language processing.
☆17Aug 2, 2021Updated 4 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
voidful / MMLM
View on GitHub
Toward Multi Modality Language Model - implementation of GPT-4o/Project Astra
☆16Dec 10, 2024Updated last year
d223302 / albert-embryology
View on GitHub
☆13Oct 28, 2020Updated 5 years ago
lingjzhu / spoken_sent_embedding
View on GitHub
Unsupervised spoken sentence embeddings
☆14Dec 14, 2022Updated 3 years ago
JSALT-2022-SSL / superb-prosody
View on GitHub
☆31Jul 13, 2023Updated 3 years ago
voidful / asrp
View on GitHub
ASR text preprocessing utility
☆21Aug 5, 2024Updated last year
dynamic-superb / dynamic-superb
View on GitHub
The official repository of Dynamic-SUPERB.
☆200Jun 24, 2025Updated last year
Chia-Hsuan-Lee / Spoken-SQuAD
View on GitHub
A spoken question answering dataset on SQUAD
☆52May 3, 2025Updated last year
ga642381 / Taiwanese-Speech-Synthesis
View on GitHub
Taiwanese Speech Synthesis with Tacotron2
☆26Oct 2, 2022Updated 3 years ago
TMMMU-Benchmark / evaluation
View on GitHub
Evaluation code for benchmarking VLMs in traditional chinese understanding
☆14Dec 22, 2025Updated 7 months ago
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
asappresearch / slue-toolkit
View on GitHub
A toolkit for Spoken Language Understanding Evaluation (SLUE) benchmark. Refer paper https://arxiv.org/abs/2111.10367 for more details. O…
☆65Feb 26, 2024Updated 2 years ago
roudimit / c2kd
View on GitHub
Code for the C2KD paper (ICASSP 2023)
☆20May 15, 2023Updated 3 years ago
voidful / vall-e-encodec
View on GitHub
☆41May 15, 2023Updated 3 years ago
MingLunHan / CIF-ColDec
View on GitHub
[ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
☆25Jul 14, 2026Updated last week
yaushian / CycleGAN-sentiment-transfer
View on GitHub
☆17Feb 28, 2018Updated 8 years ago
ORI-Muchim / BARK-RVC
View on GitHub
Multilingual-Speech-Synthesis-Voice-Conversion Using Bark + RVC
☆14Apr 19, 2025Updated last year
Sreyan88 / CompA
View on GitHub
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆23Jul 10, 2024Updated 2 years ago
voidful / asr-trainer
View on GitHub
one script for xls-r/xlsr/whisper fine-tuning
☆42Jun 29, 2023Updated 3 years ago
atosystem / SpeechCLIP
View on GitHub
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022
☆120Nov 25, 2022Updated 3 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
bartlomiej-pluta / android-tts-server
View on GitHub
The Android application providing user with REST-based interface for utilizing built-in Android's TTS engine. The web service is highly c…
☆11Jul 28, 2020Updated 5 years ago
hsiehjackson / Mr.Right
View on GitHub
Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text
☆24Aug 15, 2022Updated 3 years ago
m3yrin / nar-latent-alignment
View on GitHub
Unofficial implementation of "Non-Autoregressive Machine Translation with Latent Alignments" https://arxiv.org/abs/2004.07437
☆23Jun 14, 2020Updated 6 years ago
ga642381 / FlappyBird
View on GitHub
Super Flappy Bird in p5.js
☆10Mar 8, 2021Updated 5 years ago
choijeongsoo / utut
View on GitHub
[TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
☆31Sep 6, 2024Updated last year
EdwinYam / J-Net
View on GitHub
J-Net is aimed for audio separation with randomly weighted encoder.
☆12Oct 23, 2019Updated 6 years ago
acetylSv / non-parallel-rhythm-flexible-VC
View on GitHub
PyTorch implementation of: Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences
☆11Jul 18, 2019Updated 7 years ago
TTS-Research / PEL-TTS
View on GitHub
☆14Aug 16, 2023Updated 2 years ago
jjery2243542 / semi-supervised-ASR
View on GitHub
☆10Dec 16, 2018Updated 7 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
SpeechColab / GigaSpeech2
View on GitHub
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
☆198Apr 28, 2026Updated 2 months ago
lstrgar / ss-phoneme-seg
View on GitHub
Code for "Phoneme Segmentation Using Self-Supervised Speech Models", Strgar & Harwath, Proceedings of the IEEE Spoken Language Technology…
☆55Nov 4, 2022Updated 3 years ago
pohanchi / AALBERT
View on GitHub
The official repository for Audio ALBERT
☆68Jan 21, 2022Updated 4 years ago
jasonppy / PromptingWhisper
View on GitHub
Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
☆151Jan 16, 2024Updated 2 years ago
ga642381 / Taiwanese-Translation
View on GitHub
Taiwanese Translation with BERT based model and RNN. Collection of Taiwanese text corpus
☆13Oct 15, 2022Updated 3 years ago
NTUCourse-Neo / ncn-frontend
View on GitHub
The frontend of NTUCourse Neo.
☆34Dec 23, 2022Updated 3 years ago
nervjack2 / MelHuBERT
View on GitHub
Official implementation of MelHuBERT
☆70Feb 21, 2026Updated 5 months ago