0nutation / Lookup-Free-Quantization
☆25 Updated last year
Related projects
Alternatives and complementary repositories for Lookup-Free-Quantization
- ☆101 Updated 4 months ago
- ☆35 Updated 4 months ago
- [Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA models, now implemented with AudioLDM2 as the base model. ☆33 Updated 3 months ago
- A toolkit for computing Fréchet Inception Distance (FID) & Fréchet Video Distance (FVD) metrics. ☆11 Updated last year
- An official PyTorch implementation of the AAAI 2024 paper "Latent Space Editing in Transformer-based Flow Matching" ☆27 Updated 6 months ago
- Efficient synchronization from sparse cues ☆28 Updated 6 months ago
- Official code for the Diff-Instruct algorithm for one-step diffusion distillation ☆46 Updated 7 months ago
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆57 Updated 2 months ago
- Keras implementation of Finite Scalar Quantization ☆63 Updated last year
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆76 Updated 4 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆19 Updated 2 months ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR. ☆26 Updated last year
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆17 Updated 3 months ago
- ☆11 Updated 3 months ago
- A torch-based implementation of K-Means and K-Means++ ☆17 Updated 3 years ago
- [ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach ☆20 Updated 3 years ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024) ☆14 Updated last month
- A PyTorch implementation of Finite Scalar Quantization ☆80 Updated 11 months ago
- ☆30 Updated 3 weeks ago
- Source code for the paper 'Audio Captioning Transformer' ☆50 Updated 2 years ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions. ☆49 Updated last month
- ☆25 Updated 3 months ago
- [ICCV 2023] Online Clustered Codebook ☆145 Updated last month
- An implementation of simple diffusion in PyTorch (and JAX) ☆35 Updated last year
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph" ☆53 Updated 2 years ago
- SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer ☆90 Updated this week
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation". ☆60 Updated 3 months ago
- PyTorch implementation of “V2C: Visual Voice Cloning” ☆30 Updated last year