ALM-LAB / PACE
PACE (Podcast AI for Chapters and Episodes) is a semantic search engine that helps you find the information you need, both across and within podcasts. It was built as a project for the AssemblyAI Winter 2022 Hackathon.
☆14 Updated 2 years ago
Alternatives and similar repositories for PACE:
Users who are interested in PACE are comparing it to the repositories listed below:
- Speaker Diarization with Transformers ☆64 Updated 9 months ago
- ☆62 Updated 7 months ago
- Joint speech-language model - respond directly to audio! ☆30 Updated 9 months ago
- Repository contains code to fine-tune WhisperASR model ☆23 Updated 2 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆143 Updated last year
- ☆153 Updated last year
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆15 Updated 3 months ago
- ☆11 Updated 2 years ago
- Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation… ☆30 Updated last month
- ☆72 Updated this week
- Pre-training BART model for the Italian Language ☆15 Updated 2 years ago
- ITALIC: An ITALian Intent Classification Dataset ☆12 Updated last year
- ☆18 Updated 2 years ago
- ☆14 Updated last year
- ☆17 Updated 9 months ago
- The official implementation of ImageBind-LLM and Whisper-LLM from the paper "Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Compre… ☆19 Updated last year
- Datasets for turn-taking research ☆12 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆61 Updated this week
- babyLM WhisBERT code ☆18 Updated 9 months ago
- Repository having the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini… ☆11 Updated 11 months ago
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities ☆111 Updated 2 months ago
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆140 Updated 2 weeks ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi… ☆36 Updated 7 months ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆11 Updated 8 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆56 Updated 7 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆80 Updated last year
- ☆44 Updated 2 years ago
- Data and code for the paper "NormBank: A Knowledge Bank of Situational Social Norms" ☆24 Updated last year
- Official implementation of USR (NeurIPS 2024) ☆29 Updated 2 months ago