aimagelab / perceive-transform-and-act
PyTorch code for the paper: "Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation"
☆19 · Aug 5, 2021 · Updated 4 years ago
Alternatives and similar repositories for perceive-transform-and-act
Users who are interested in perceive-transform-and-act are comparing it to the repositories listed below.
- PyTorch code for BMVC 2019 paper: Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters ☆20 · Jan 4, 2023 · Updated 3 years ago
- The repository of ECCV 2020 paper `Active Visual Information Gathering for Vision-Language Navigation` ☆44 · Apr 9, 2022 · Updated 3 years ago
- TopViewRS: Vision-Language Models as Top-View Spatial Reasoners (EMNLP 2024 Oral) ☆15 · Jun 14, 2025 · Updated 7 months ago
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering" ☆14 · Oct 13, 2020 · Updated 5 years ago
- large scale pretrain for navigation task ☆94 · Mar 2, 2023 · Updated 2 years ago
- SelfCriticalSequenceTrainingforImageCaptioning ☆21 · May 27, 2017 · Updated 8 years ago
- Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation" ☆88 · Jun 27, 2024 · Updated last year
- PyTorch code for ICLR 2019 paper: Self-Monitoring Navigation Agent via Auxiliary Progress Estimation ☆122 · Oct 3, 2023 · Updated 2 years ago
- REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments ☆148 · Feb 7, 2026 · Updated last week
- Code and data of the Fine-Grained R2R Dataset proposed in the EMNLP 2021 paper Sub-Instruction Aware Vision-and-Language Navigation ☆56 · Oct 26, 2021 · Updated 4 years ago
- A simple but well-performing "single-hop" visual attention model for the GQA dataset ☆20 · Aug 8, 2019 · Updated 6 years ago
- Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020) ☆59 · Oct 7, 2022 · Updated 3 years ago
- Official Pytorch implementation for NeurIPS 2022 paper "Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigati… ☆33 · Apr 23, 2023 · Updated 2 years ago
- Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples ☆40 · Nov 27, 2024 · Updated last year
- Bottom-up features extractor implemented in PyTorch. ☆72 · Dec 5, 2019 · Updated 6 years ago
- Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022) ☆30 · Aug 2, 2022 · Updated 3 years ago
- Tracking Of Agent (actions and belief) and Spatio-TEmporal Reasoning ☆14 · Feb 7, 2020 · Updated 6 years ago
- A modified version of the SfM pipeline given in the link. The modifications are for a certain heuristics to reconstruct 3D from ambiguous… ☆10 · Nov 29, 2017 · Updated 8 years ago
- This is the repo for Multi-level textual grounding ☆34 · Jul 21, 2020 · Updated 5 years ago
- Notebooks for time-series forecasting (SARIMA) ☆10 · Apr 22, 2019 · Updated 6 years ago
- The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning ☆13 · Apr 14, 2024 · Updated last year
- ☆12 · May 26, 2022 · Updated 3 years ago
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV … ☆23 · Dec 4, 2025 · Updated 2 months ago
- IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning ☆79 · Nov 23, 2020 · Updated 5 years ago
- Implementation for our paper "Conditional Image-Text Embedding Networks" ☆39 · Mar 19, 2020 · Updated 5 years ago
- Scene Graph Parsing as Dependency Parsing ☆41 · May 22, 2019 · Updated 6 years ago
- Applications of Model-Structured Neural Networks using nnodely ☆15 · Feb 2, 2026 · Updated last week
- Generate images of Chinese license plates ☆11 · Feb 8, 2021 · Updated 5 years ago
- OpenAI ROS ☆12 · Mar 7, 2019 · Updated 6 years ago
- PyTorch reimplementation of Noise2Same with enhancements ☆11 · Nov 9, 2024 · Updated last year
- [ICRA 2020] Residual Reactive Navigation: Combining Classical and Learned Navigation Strategies for Deployment in Unknown Environments ☆12 · Jan 23, 2020 · Updated 6 years ago
- Towards Target-Driven Visual Navigation in Indoor Scenes via Generative Imitation Learning ☆12 · Dec 20, 2020 · Updated 5 years ago
- Q&A dataset for many-shot jailbreaking ☆14 · Jul 19, 2024 · Updated last year
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models ☆11 · Apr 9, 2024 · Updated last year
- ☆15 · Apr 11, 2023 · Updated 2 years ago
- Record my learning progress. ☆10 · Mar 1, 2022 · Updated 3 years ago
- Transform your selfie into an anime image ☆12 · Dec 28, 2020 · Updated 5 years ago
- ☆13 · Apr 3, 2024 · Updated last year
- Components for image segmentation ☆13 · Dec 9, 2023 · Updated 2 years ago