IrisLi17 / Sound_of_Pixels
Course project for Introduction to Visual and Audio System. PyTorch implementation of "The Sound of Pixels".
☆8 Updated 6 years ago
Alternatives and similar repositories for Sound_of_Pixels
Users who are interested in Sound_of_Pixels are comparing it to the libraries listed below
- ☆42 Updated 4 years ago
- Official Repository of NeurIPS2021 paper: PTR ☆33 Updated 3 years ago
- ChatGPT - Review & Rebuttal: A browser extension for generating reviews and rebuttals, powered by ChatGPT. ☆250 Updated 2 years ago
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by … ☆74 Updated last year
- ☆19 Updated last year
- Code for Discriminative Sounding Objects Localization (NeurIPS 2020) ☆57 Updated 3 years ago
- Code for Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (AAAI 2022). ☆29 Updated 3 years ago
- [CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation ☆66 Updated last year
- Download scripts and tools for Replay dataset. ☆32 Updated last year
- Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation ☆25 Updated 3 years ago
- A curated list of Awesome Embodied AI works, still under construction. It currently contains a list of Simulators, Tasks and Datasets. ☆31 Updated 4 years ago
- Speech2Vec Reality Check ☆82 Updated 2 years ago
- Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020) ☆72 Updated 4 years ago
- A thin wrapper around ChatGPT for improving paper writing. ☆254 Updated 2 years ago
- Codebase for the paper "Visually Informed Binaural Audio Generation without Binaural Audios" (CVPR 2021) ☆66 Updated 3 years ago
- ☆16 Updated 2 years ago
- ☆38 Updated 3 years ago
- Keras implementation of Finite Scalar Quantization ☆71 Updated last year
- A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets. ☆97 Updated 2 years ago
- [CVPR'23 Highlight] AutoAD: Movie Description in Context. ☆96 Updated 6 months ago
- ICLR2023 statistics ☆60 Updated last year
- TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/TokenBridge ☆112 Updated last week
- Code for Motion-aware Contrastive Video Representation Learning via Foreground-background Merging (CVPR 2022) ☆47 Updated last year
- 2.5D visual sound dataset ☆99 Updated 3 years ago
- A paper list that includes world models or generative video models for embodied agents. ☆23 Updated 4 months ago
- A PyTorch Implementation of Finite Scalar Quantization ☆132 Updated last year
- SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation, ICLR2022" (In PyTorch) ☆19 Updated 2 years ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆67 Updated 6 months ago
- ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021. ☆57 Updated 3 years ago
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M… ☆25 Updated 3 months ago