This repository links to collections of influential and interesting research papers from top AI conferences, together with open-source code that promotes reproducibility and provides implementation details beyond the scope of each paper. Stay up to date with the latest advances in AI research!
☆119 · Oct 24, 2025 · Updated 7 months ago
Alternatives and similar repositories for NewEraAI-Papers
Users that are interested in NewEraAI-Papers are comparing it to the libraries listed below.
- FG 2024 Papers: Explore a comprehensive collection of research papers presented at one of the premier conferences on automatic face and g… ☆15 · May 18, 2024 · Updated 2 years ago
- ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore t… ☆527 · May 5, 2025 · Updated last year
- Read articles, explore effectiveness metrics for speech enhancement methodologies. Seamlessly integrate code implementations for better u… ☆27 · Apr 19, 2024 · Updated 2 years ago
- CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest d… ☆451 · Jul 15, 2024 · Updated last year
- INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. … ☆689 · Dec 25, 2024 · Updated last year
- ISMIR 2023 Papers: A complete collection of influential and exciting research papers from the ISMIR 2023 conference. ☆105 · Dec 2, 2023 · Updated 2 years ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆13 · Oct 11, 2022 · Updated 3 years ago
- [KDD 2026] Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe ☆32 · Aug 10, 2025 · Updated 9 months ago
- Algorithms for Intelligent Assessment of Human Personality Traits based on His Multimodal Data for ranking potential candidates to perfo… ☆65 · Dec 5, 2025 · Updated 5 months ago
- Multimodal Open Source Framework for Conversational Agent Research and Development. ☆26 · Feb 16, 2025 · Updated last year
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi… ☆14 · Nov 25, 2022 · Updated 3 years ago
- This repository contains the code for the paper "Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection fr… ☆12 · Dec 19, 2025 · Updated 5 months ago
- This repository contains the code for the paper "Self-supervised Text Style Transfer using Cycle-Consistent Adversarial Networks". ☆11 · Dec 2, 2024 · Updated last year
- CMU multilingual speech repository ☆30 · Apr 15, 2022 · Updated 4 years ago
- Script to perform statistical significance test between ASR hypotheses. ☆23 · Aug 13, 2017 · Updated 8 years ago
- This repo contains the code for "Voice Disorder Analysis: A Transformer-based Approach", accepted at Interspeech 2024 ☆15 · Jun 11, 2024 · Updated last year
- DSing ASR task: Resources and Baseline for an unaccompanied singing ASR. ☆19 · Nov 23, 2021 · Updated 4 years ago
- Python library for calculating the mean opinion score and 95% confidence interval of the standard deviation of text-to-speech ratings acc… ☆24 · Jan 31, 2025 · Updated last year
- Official PyTorch implementation for "Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech … ☆33 · May 11, 2025 · Updated last year
- ACM MM 2022 - PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding ☆11 · Aug 12, 2022 · Updated 3 years ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆24 · Oct 8, 2025 · Updated 7 months ago
- This repository contains the official implementation of "A Benchmarking Study of Kolmogorov-Arnold Networks on Tabular Data" (under revie… ☆17 · Jul 10, 2024 · Updated last year
- Official code for "DiffX: Guide Your Layout to Cross-Modal Generative Modeling" ☆23 · Feb 20, 2025 · Updated last year
- ☆64 · May 23, 2022 · Updated 4 years ago
- Multispeaker Community Vocoder Model for DiffSinger ☆38 · Aug 11, 2025 · Updated 9 months ago
- This repository contains a short introduction on the topic of audio and speech processing -- from basics to applications. ☆19 · Dec 20, 2023 · Updated 2 years ago
- Solos: A Dataset for Audio-Visual Music Analysis ☆24 · Feb 17, 2023 · Updated 3 years ago
- Source Separation training codebase for the Sound Demixing Challenge 2023. ☆45 · May 18, 2023 · Updated 3 years ago
- [Interspeech 2024] Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation ☆14 · Nov 28, 2024 · Updated last year
- Draco is a script to convert reddit thread to Org document ☆10 · Aug 9, 2022 · Updated 3 years ago
- wake-up word emotion recognition [APSIPA 2022] ☆17 · Nov 11, 2022 · Updated 3 years ago
- Official implementation for FlowSep ☆74 · Jan 2, 2025 · Updated last year
- SimulEval: A General Evaluation Toolkit for Simultaneous Translation ☆123 · Sep 13, 2024 · Updated last year
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ☆204 · Dec 13, 2024 · Updated last year
- Unsupervised Rhythm Modeling for Voice Conversion ☆85 · Aug 3, 2023 · Updated 2 years ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆101 · Jul 24, 2024 · Updated last year
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers. ☆12 · Nov 7, 2024 · Updated last year
- Musical Word Embedding for Music Tagging and Retrieval [IEEE TASLP] ☆28 · Apr 23, 2024 · Updated 2 years ago
- ☆41 · Feb 16, 2022 · Updated 4 years ago