IrisLi17 / Sound_of_Pixels
Course project for Introduction to Visual and Audio System. PyTorch implementation of "The Sound of Pixels".
☆8Updated 6 years ago
Alternatives and similar repositories for Sound_of_Pixels
Users that are interested in Sound_of_Pixels are comparing it to the libraries listed below
- ☆19Updated 2 years ago
- A thin wrapper around ChatGPT for improving paper writing.☆254Updated 2 years ago
- ☆44Updated 4 years ago
- SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation, ICLR 2022" (In PyTorch)☆19Updated 2 years ago
- ☆39Updated 3 years ago
- Official Repository of NeurIPS 2021 paper: PTR☆33Updated 3 years ago
- ChatGPT - Review & Rebuttal: A browser extension for generating reviews and rebuttals, powered by ChatGPT.☆252Updated 2 years ago
- ☆42Updated last year
- A curated list of Awesome Embodied AI works, still under construction. It currently contains a list of Simulators, Tasks and Datasets.☆31Updated 4 years ago
- ☆22Updated 2 years ago
- This repository is a collection of research papers on World Models.☆39Updated last year
- Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation☆17Updated 3 years ago
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces☆78Updated 2 months ago
- Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes"☆54Updated last year
- [CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation☆68Updated 2 years ago
- Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning"☆44Updated last year
- Main repo for SimWorld simulator.☆59Updated last month
- ☆29Updated 3 years ago
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan…☆61Updated 4 months ago
- [ICLR 2023] SQA3D for embodied scene understanding and reasoning☆143Updated last year
- ☆21Updated 2 years ago
- A PyTorch Implementation of Finite Scalar Quantization☆146Updated last year
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation☆110Updated last week
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks☆57Updated 10 months ago
- MiroTrain is an efficient and algorithm-first system for post-training large agentic models.☆30Updated this week
- Official repo for EscapeCraft (a 3D environment for room escape) and benchmark MM-Escape. This work was accepted by ICCV 2025.☆27Updated last month
- Speech2Vec Reality Check☆83Updated 2 years ago
- Sample LaTeX file for HKU PhD thesis.☆26Updated 3 years ago
- [CVPR 2024] Data and benchmark code for the EgoExoLearn dataset☆65Updated 11 months ago
- Code accompanying our ECCV-2020 paper on 3D Neural Listeners.☆132Updated 4 years ago