tsujuifu / pytorch_empirical-mvm
A PyTorch implementation of EmpiricalMVM
☆41Dec 18, 2023Updated 2 years ago
Alternatives and similar repositories for pytorch_empirical-mvm
Users who are interested in pytorch_empirical-mvm are comparing it to the libraries listed below
- A PyTorch implementation of TVC☆24Dec 18, 2023Updated 2 years ago
- A PyTorch implementation of VIOLET☆140Dec 17, 2023Updated 2 years ago
- ORES: Open-vocabulary Responsible Visual Synthesis☆14Dec 12, 2023Updated 2 years ago
- A PyTorch implementation of BCO☆12Jun 19, 2023Updated 2 years ago
- Code for DVD A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆14Oct 12, 2021Updated 4 years ago
- A Unified Framework for Video-Language Understanding☆61Jun 17, 2023Updated 2 years ago
- Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"☆247May 26, 2022Updated 3 years ago
- Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T…☆34May 14, 2020Updated 5 years ago
- Code for the paper "A Divide-and-Conquer Approach to the Summarization of Long Documents"☆18Jun 8, 2021Updated 4 years ago
- Research code for CVPR 2021 paper "End-to-End Human Pose and Mesh Reconstruction with Transformers"☆17May 22, 2021Updated 4 years ago
- CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment☆22Apr 15, 2022Updated 3 years ago
- An alternative EQA paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)☆25Jun 22, 2022Updated 3 years ago
- ☆25Mar 4, 2022Updated 3 years ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆13Jun 28, 2025Updated 7 months ago
- A PyTorch implementation of LDAST☆26Dec 17, 2023Updated 2 years ago
- MDMMT: Multidomain Multimodal Transformer for Video Retrieval☆26Jun 28, 2021Updated 4 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 9 months ago
- Codebase for the paper "How Crucial is Transformer in Decision Transformer?". Containing experiments on different pendulum tasks and code…☆28Mar 24, 2023Updated 2 years ago
- ROCK model for Knowledge-Based VQA in Videos☆31Oct 19, 2020Updated 5 years ago
- ☆23Apr 8, 2024Updated last year
- Learning with Noisy Labels by adopting a peer prediction loss function.☆35Mar 3, 2020Updated 5 years ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 4 years ago
- ☆73Jun 3, 2022Updated 3 years ago
- ☆33Nov 12, 2018Updated 7 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- GIT: A Generative Image-to-text Transformer for Vision and Language☆580Dec 2, 2023Updated 2 years ago
- MERLOT: Multimodal Neural Script Knowledge Models☆225Mar 15, 2022Updated 3 years ago
- Repository of our paper 'Refer-it-in-RGBD' in CVPR 2021☆43May 24, 2024Updated last year
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week.☆34Jul 27, 2021Updated 4 years ago
- Code for paper, "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" ECCV 2022☆39Feb 17, 2023Updated 3 years ago
- Patching open-vocabulary models by interpolating weights☆91Sep 28, 2023Updated 2 years ago
- Deep learning introduction to beginners with PyTorch☆12Apr 24, 2020Updated 5 years ago
- This is a data repository for the ACL 2020 paper: "Let Me Choose: From Verbal Context to Font Selection"☆10May 5, 2020Updated 5 years ago
- Voltron Evaluation: Diverse Evaluation Tasks for Robotic Representation Learning☆37Jul 9, 2023Updated 2 years ago
- ECG analysis to classify anterior myocardial infarction cases.☆10May 17, 2017Updated 8 years ago
- A Google Chrome Extension that replaces the official New Tab page with a beautiful to-do list.☆12Mar 7, 2018Updated 7 years ago
- Feature Extraction Toolbox from CUHK&ETHZ&SIAT submission to ActivityNet 2016☆32Mar 31, 2019Updated 6 years ago
- Beyond Entities: A Large-Scale Multi-Modal Knowledge Graph with Triplet Fact Grounding☆11May 23, 2024Updated last year
- ☆11May 24, 2024Updated last year