Official PyTorch Implementation for CVPR'23 Paper, "The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training"
☆20Dec 11, 2023Updated 2 years ago
Alternatives and similar repositories for gst-visdial
Users that are interested in gst-visdial are comparing it to the libraries listed below
- SelecMix: Debiased Learning by Contradicting-pair Sampling (NeurIPS 2022)☆13Jun 5, 2024Updated last year
- Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning (ICML 2024)☆19Jun 5, 2024Updated last year
- ✨ Official PyTorch Implementation for EMNLP'19 Paper, "Dual Attention Networks for Visual Reference Resolution in Visual Dialog"☆45Mar 19, 2023Updated 2 years ago
- [WACV 2025] Official PyTorch code for "Background-aware Moment Detection for Video Moment Retrieval"☆16Feb 24, 2025Updated last year
- Source code for paper "VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution"☆10Nov 1, 2022Updated 3 years ago
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"☆13May 12, 2023Updated 2 years ago
- 🦾 PyTorch Implementation for the ICRA'24 Paper, "PROGrasp: Pragmatic Human-Robot Communication for Object Grasping"☆15May 5, 2025Updated 10 months ago
- A companion for the Causal Artificial Intelligence book.☆15Sep 24, 2025Updated 5 months ago
- Decision Transformer JAX - Reproduction of 'Decision Transformer: Reinforcement Learning via Sequence Modeling' in JAX and Haiku☆13Aug 14, 2024Updated last year
- ESPER☆24Mar 29, 2024Updated last year
- Recent Advances in Visual Dialog☆30Aug 19, 2022Updated 3 years ago
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature.☆35Feb 13, 2025Updated last year
- Dataset and Source code for EMNLP 2019 paper "What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues"☆26Sep 10, 2021Updated 4 years ago
- This repository contains code used in our ACL'20 paper "History for Visual Dialog: Do we really need it?"☆34Mar 24, 2023Updated 2 years ago
- This is the official implementation of NeurIPS 2022 paper "Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal R…☆35Jan 25, 2023Updated 3 years ago
- This repository is the official implementation of Topology-Informed Graph Transformer (Choi et al., GRaM Workshop at ICML 2024).☆12Dec 28, 2024Updated last year
- Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022☆11Apr 13, 2025Updated 10 months ago
- Partially Non-Autoregressive Image Captioning☆10Sep 30, 2021Updated 4 years ago
- Generalization in Metric Learning: Should the Embedding Layer be the Embedding Layer?☆11Jan 3, 2019Updated 7 years ago
- Segment Anything with Webcam in Real-Time with FastSAM☆10Nov 19, 2023Updated 2 years ago
- Training and testing code from our CVPR 2023 paper "Are Deep Neural Networks SMARTer than Second Graders?"☆11Aug 10, 2023Updated 2 years ago
- [KDD Explore'24] Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities☆17May 7, 2025Updated 10 months ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR2023)☆40Feb 15, 2023Updated 3 years ago
- An official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos (ICCV 23)"☆52Aug 13, 2023Updated 2 years ago
- Surrogate Modeling of the Aerodynamic Performance for Transonic Regime☆13Feb 12, 2024Updated 2 years ago
- A Simple Framework for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- Improving word mover’s distance by leveraging self-attention matrix (Published in EMNLP 2023 Findings)☆10Jun 17, 2025Updated 8 months ago
- Implementation for Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering☆10Mar 17, 2022Updated 3 years ago
- [Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics☆37Jan 22, 2025Updated last year
- Phys4DGen: A Physics-Driven Framework for Controllable and Efficient 4D Content Generation from a Single Image☆12May 10, 2025Updated 9 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability☆13Mar 11, 2025Updated 11 months ago
- Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022☆13Apr 13, 2022Updated 3 years ago