h-munakata/Lighthouse-Wrapper-for-Audio-Moment-Retrieval

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/h-munakata/Lighthouse-Wrapper-for-Audio-Moment-Retrieval)

h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval

☆13

Alternatives and similar repositories for Lighthouse-Wrapper-for-Audio-Moment-Retrieval

Users that are interested in Lighthouse-Wrapper-for-Audio-Moment-Retrieval are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Sreyan88 / ReCLAP
View on GitHub
☆33Dec 23, 2025Updated 7 months ago
wonjune-kang / expressive-speech-retrieval
View on GitHub
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15Aug 18, 2025Updated 11 months ago
fgnt / sed_scores_eval
View on GitHub
☆41Feb 18, 2026Updated 5 months ago
Sreyan88 / RECAP
View on GitHub
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16Jun 23, 2024Updated 2 years ago
BUTSpeechFIT / SOT-DiCoW
View on GitHub
Multi-talker ASR based on DiCoW with Serialized Output Training
☆21Sep 18, 2025Updated 10 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
taishi-n / torchrir
View on GitHub
PyTorch-based room impulse response (RIR) simulation toolkit with dynamic scenes, GPU acceleration.
☆23Jul 23, 2026Updated last week
Huntersxsx / TSGV-Learning-List
View on GitHub
Temporal Sentence Grounding in Videos / Natural Language Video Localization / Video Moment Retrieval的相关工作
☆31Mar 4, 2022Updated 4 years ago
Sreyan88 / CompA
View on GitHub
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆23Jul 10, 2024Updated 2 years ago
kjw11 / CSEnet-ASR
View on GitHub
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12Mar 14, 2025Updated last year
JusperLee / Gull-Codec-Training
View on GitHub
☆12Mar 11, 2025Updated last year
microsoft / NoAudioCaptioning
View on GitHub
Repository for "Training Audio Captioning Models without Audio"
☆10Sep 26, 2023Updated 2 years ago
merlresearch / sebbs
View on GitHub
Prediction of sound event bounding boxes (SEBBs)
☆35Aug 2, 2024Updated last year
ETH-DISCO / sao-instruct
View on GitHub
Official repo for SAO-Instruct: Free-form Audio Editing using Natural Language Instructions presented at NeurIPS 2025
☆18Oct 28, 2025Updated 9 months ago
duyichao / NPDA-KNN-ST
View on GitHub
Official implementation of EMNLP'2022 paper "Non-Parametric Domain Adaptation for End-to-End Speech Translation"
☆11Oct 26, 2022Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
Aisaka0v0 / TS-Whisper
View on GitHub
☆33Jun 12, 2025Updated last year
slSeanWU / beats-conformer-bart-audio-captioner
View on GitHub
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…
☆41Jan 6, 2024Updated 2 years ago
skit-ai / N-Best-ASR-Transformer
View on GitHub
Code for ACL-IJCNLP 2021 paper "N-Best-ASR-Transformer: Enhancing SLU Performance using Multiple ASR Hypotheses."
☆17Nov 30, 2021Updated 4 years ago
schufo / tisms
View on GitHub
This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech"
☆16Apr 8, 2024Updated 2 years ago
fakerybakery / simpletts
View on GitHub
A lightweight Python library for running TTS models with a unified API.
☆20Feb 18, 2025Updated last year
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
rithiksachdev / PostASR-Correction-SLT2024
View on GitHub
☆18Jul 22, 2024Updated 2 years ago
ArenAcikgoz / Whisper-Alignment
View on GitHub
Forced alignment decoder for Whisper.
☆16Mar 13, 2024Updated 2 years ago
tzyll / ChineseHP
View on GitHub
Dataset for Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models in Interspeech 2024.
☆16Jul 4, 2024Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
zeyuxie29 / AudioTime
View on GitHub
☆39Jul 4, 2024Updated 2 years ago
line / WaveTrainerFit
View on GitHub
Official implementation of "Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech G…
☆16Feb 6, 2026Updated 5 months ago
amritkromana / disfluency_detection_from_audio
View on GitHub
☆35Aug 22, 2024Updated last year
Xianchao-Wu / wenet-deep-sparse-conformer
View on GitHub
☆15Aug 25, 2022Updated 3 years ago
unilight / jatts
View on GitHub
JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit
☆44Mar 13, 2026Updated 4 months ago
interactive-cookbook / ara
View on GitHub
Corpus and code for Aligned Recipe Actions (ARA) corpus, EMNLP 2021
☆10May 22, 2024Updated 2 years ago
thamquocdung / eCMU
View on GitHub
eCMU: An Efficient Phase-aware Framework for Music Source Separation with Conformer (IEEE RIVF23)
☆10Oct 30, 2024Updated last year
alexisdmacintyre / SpeechBreathingToolbox
View on GitHub
Tools for the automatic detection of speech-related inhalation events and characterisation of the speech respiratory cycle.
☆11Feb 17, 2024Updated 2 years ago
ductuantruong / speaker_age_estimation_ssl_study
View on GitHub
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14Oct 19, 2022Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
merlresearch / tssep
View on GitHub
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings
☆43Oct 27, 2025Updated 9 months ago
EndlessReform / smoltts
View on GitHub
Open TTS models, built for streaming on the edge
☆45Mar 16, 2025Updated last year
Aratako / CALM-DACVAE
View on GitHub
An attempt to reproduce CALM (Continuous Audio Language Models) using DACVAE as the audio VAE.
☆18Feb 20, 2026Updated 5 months ago
lizhaoqing / UNISON
View on GitHub
☆43Jun 3, 2026Updated last month
ORI-Muchim / Efficient-Speech
View on GitHub
Lightweight Korean TTS Model based on FastSpeech2
☆15Mar 4, 2026Updated 4 months ago
KrishnaDN / E2E_ASR_Confidence_Estimation
View on GitHub
Implementation of the paper "Confidence estimation for attention based sequence to sequence models for speech recognition"
☆16May 9, 2021Updated 5 years ago
gzhu06 / Cacophony
View on GitHub
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
☆49Jan 19, 2026Updated 6 months ago