Jiang-Yidi / FlatTrajectoryDistillation_FTD
The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)
☆18 Updated last year
Alternatives and similar repositories for FlatTrajectoryDistillation_FTD:
Users who are interested in FlatTrajectoryDistillation_FTD are comparing it to the repositories listed below
- INTERSPEECH2023: Target Active Speaker Detection with Audio-visual Cues ☆49 Updated last year
- ☆12 Updated 3 years ago
- [Interspeech 2024] LiteFocus is a tool designed to accelerate diffusion-based TTA models, now implemented with the base model AudioLDM2. ☆33 Updated 7 months ago
- ☆20 Updated 5 months ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆21 Updated 6 months ago
- Official implementation for AVGN ☆34 Updated last year
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023) ☆9 Updated last year
- ☆14 Updated 3 years ago
- Can audio-visual integration strengthen robustness under multimodal attacks? ☆28 Updated 2 years ago
- An official PyTorch implementation of the AAAI 2024 paper "Latent Space Editing in Transformer-based Flow Matching" ☆36 Updated 10 months ago
- ☆46 Updated 2 years ago
- ☆11 Updated last year
- Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV2023]. ☆19 Updated 5 months ago
- [ICLR2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models ☆31 Updated 2 weeks ago
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023) ☆40 Updated last year
- Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation ☆24 Updated 3 years ago
- (CVPR 2024) "Unsegment Anything by Simulating Deformation" ☆26 Updated 9 months ago
- ☆17 Updated 3 years ago
- Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024) ☆62 Updated this week
- This repo contains a script to download the MUSIC dataset from YouTube ☆8 Updated last year
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph" ☆54 Updated 2 years ago
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation". ☆61 Updated 7 months ago
- Source code for the paper 'Audio Captioning Transformer' ☆53 Updated 3 years ago
- Source code for "Synchformer: Efficient Synchronization from Sparse Cues" (ICASSP 2024) ☆48 Updated 3 weeks ago
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆29 Updated 5 months ago
- [Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions ☆30 Updated 3 weeks ago
- ☆17 Updated 9 months ago
- Vision Transformers are Parameter-Efficient Audio-Visual Learners ☆99 Updated last year
- ☆27 Updated last year