☆26Oct 20, 2021Updated 4 years ago
Alternatives and similar repositories for D-LSG-Video-Caption
Users that are interested in D-LSG-Video-Caption are comparing it to the libraries listed below
- The PyTorch code of the AAAI2021 paper "Non-Autoregressive Coarse-to-Fine Video Captioning".☆57Oct 22, 2023Updated 2 years ago
- Official pytorch implementation of the AAAI 2021 paper "Semantic Grouping Network for Video Captioning"☆54Jul 9, 2021Updated 4 years ago
- IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning☆79Nov 23, 2020Updated 5 years ago
- Source code for Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy☆55Jul 31, 2021Updated 4 years ago
- ☆62May 11, 2021Updated 4 years ago
- Code for GLAT (Global Local Transformer), ECCV 2020 "Learning Visual Commonsense for Robust Scene Graph Generation"☆11Dec 16, 2020Updated 5 years ago
- A curated list of research papers in Video Captioning☆121Jan 5, 2021Updated 5 years ago
- Bottom-up Top-down image captioning model with PyTorch.☆14Dec 5, 2020Updated 5 years ago
- ☆15Aug 20, 2024Updated last year
- A curated list of Multimodal Captioning related research (including image captioning, video captioning, and text captioning)☆113Jun 6, 2022Updated 3 years ago
- Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*☆15Apr 6, 2021Updated 4 years ago
- Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"☆247May 26, 2022Updated 3 years ago
- ☆28Sep 1, 2021Updated 4 years ago
- Official Code of CVPR'23 Paper "VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision"☆22Apr 21, 2024Updated last year
- [CVPR2022] Official code for Hierarchical Modular Network for Video Captioning. Our proposed HMN is implemented with PyTorch.☆50Sep 30, 2022Updated 3 years ago
- BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval☆26Jul 22, 2022Updated 3 years ago
- MDMMT: Multidomain Multimodal Transformer for Video Retrieval☆26Jun 28, 2021Updated 4 years ago
- VLG-Net: Video-Language Graph Matching Networks for Video Grounding☆31May 31, 2022Updated 3 years ago
- This repository focuses on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP☆411Nov 14, 2022Updated 3 years ago
- Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019)☆31Apr 13, 2020Updated 5 years ago
- Official implementation for paper Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos☆28Dec 8, 2023Updated 2 years ago
- ☆30Mar 2, 2022Updated 4 years ago
- Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"☆23Mar 4, 2020Updated 6 years ago
- ☆29Oct 4, 2023Updated 2 years ago
- TCPNet☆38Dec 1, 2021Updated 4 years ago
- Code for our CVPR 2021 paper Glance and Gaze: Inferring Action-aware Points for One-Stage Human-Object Interaction Detection☆30Apr 16, 2021Updated 4 years ago
- Style Transfer by Rigid Alignment in Neural Net Feature Space☆11Jan 23, 2021Updated 5 years ago
- SGAP-Net: Semantic-Guided Attentive Prototypes Network for Few-Shot Human-Object Interaction Recognition, AAAI2020.☆14Dec 15, 2020Updated 5 years ago
- PyTorch implementation of "Detecting 32 Pedestrian Attributes for Autonomous Vehicles"☆33Oct 16, 2021Updated 4 years ago
- New word discovery / new word mining / degree of freedom / cohesion / python3☆10May 28, 2019Updated 6 years ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 2 years ago
- Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*☆30Apr 16, 2021Updated 4 years ago
- ☆35Jun 6, 2023Updated 2 years ago
- A PyTorch implementation of state of the art video captioning models from 2015-2019 on MSVD and MSRVTT datasets.☆74Jul 30, 2023Updated 2 years ago
- End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)☆229Jan 3, 2024Updated 2 years ago
- The project is intended to demonstrate Lane tracking & detection on Qualcomm’s Robotics Platform RB5. YOLOP is the architecture used to i…☆10Aug 22, 2023Updated 2 years ago
- Douban movie review visualization☆10May 19, 2016Updated 9 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 4 years ago
- SMS verification code module☆10Jul 25, 2021Updated 4 years ago