IrisLi17 / Sound_of_Pixels

Course project for Introduction to Visual and Audio System. PyTorch implementation of "The Sound of Pixels".

☆8

Alternatives and similar repositories for Sound_of_Pixels

Users who are interested in Sound_of_Pixels are comparing it to the repositories listed below.

yixchen / YouRefIt_ERU
☆19 Updated 2 years ago
chengaopro / Awesome-EmbodiedAI
A curated list of Embodied AI works, still under construction. It currently contains a list of simulators, tasks, and datasets.
☆31 Updated 4 years ago
evelinehong / PTR
Official Repository of NeurIPS2021 paper: PTR
☆33 Updated 3 years ago
ashawkey / chatgpt_please_improve_my_paper_writing
A thin wrapper around ChatGPT for improving paper writing.
☆254 Updated 2 years ago
yyf17 / SAAVN
SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation,ICLR2022" (In PyTorch)
☆19 Updated 2 years ago
rhgao / ObjectFolder
ObjectFolder Dataset
☆160 Updated 2 years ago
LinXueyuanStdio / chatgpt-review-rebuttal-extension
ChatGPT - Review & Rebuttal: A browser extension for generating reviews and rebuttals, powered by ChatGPT.
☆250 Updated 2 years ago
JiabenChen / iQuery
[CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation
☆68 Updated last year
Xianchao-Wu / perceiver-pytorch
☆44 Updated 4 years ago
WellyZhang / ACRE
ACRE: Abstract Causal REasoning Beyond Covariation
☆19 Updated 3 years ago
uiuc-robovision / hands-as-probes
☆25 Updated 3 years ago
zfchenUnique / compositional_physics_learner
☆39 Updated 2 years ago
NVlabs / causal_comp
☆22 Updated 2 years ago
Chat-3D / Chat-3D
Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes"
☆54 Updated last year
HanqingWangAI / SSM-VLN
Code and Data for our CVPR 2021 paper "Structured Scene Memory for Vision-Language Navigation"
☆39 Updated 3 years ago
OpenGVLab / VeBrain
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
☆73 Updated last month
agrimgupta92 / maskvit
☆73 Updated 3 years ago
referit3d / referit3d
Code accompanying our ECCV-2020 paper on 3D Neural Listeners.
☆129 Updated 4 years ago
PatrickHua / Awesome-World-Models
This repository is a collection of research papers on World Models.
☆39 Updated last year
evelinehong / FixIt
Code for the CVPR 2022 paper: Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
☆32 Updated 2 years ago
joyhsu0504 / LEFT
☆42 Updated last year
wllmzhu / G-VUE
General-purpose Visual Understanding Evaluation
☆20 Updated last year
evelinehong / 3D-Concept-Grounding
Code Release of "3D Concept Grounding on Neural Fields (NeurIPS2022)"
☆15 Updated 2 years ago
Buzz-Beater / EgoTaskQA
Code for NeurIPS 2022 Datasets and Benchmarks paper - EgoTaskQA: Understanding Human Tasks in Egocentric Videos.
☆33 Updated 2 years ago
snaredataset / snare
SNARE Dataset with MATCH and LaGOR models
☆24 Updated last year
UCSB-NLP-Chang / Visual-Spatial-Planning
Official release of the benchmark in paper "VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for …
☆11 Updated last year
DirtyHarryLYL / Sandwich
Bidirectional Mapping between Action Physical-Semantic Space
☆31 Updated 10 months ago
LargeWorldModel / ElasticTok
ElasticTok: Adaptive Tokenization for Image and Video
☆72 Updated 8 months ago
maitrix-org / SimWorld
Main repo for SimWorld simulator.
☆53 Updated 3 weeks ago
QinengWang-Aiden / Awesome-embodied-world-model-papers
A paper list that includes world models or generative video models for embodied agents.
☆24 Updated 5 months ago