Sha-Lab/CMHSE

The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch

☆16

Alternatives and similar repositories for CMHSE

Users who are interested in CMHSE are comparing it to the repositories listed below.

zbwglory / CMHSE
The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch
☆20 · Apr 26, 2020 · Updated 6 years ago
ikuinen / CMIN_moment_retrieval
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
☆87 · Nov 22, 2020 · Updated 5 years ago
m-bain / CondensedMovies-chall
Condensed Movies Challenge 2021
☆22 · Sep 21, 2022 · Updated 3 years ago
LisaAnne / LocalizingMoments
GitHub for my ICCV 2017 paper: "Localizing Moments in Video with Natural Language"
☆198 · Oct 31, 2020 · Updated 5 years ago
hassanhub / R3Transformer
Official Python implementation of R3-Transformer
☆15 · Nov 30, 2020 · Updated 5 years ago
zinengtang / DeCEMBERT
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 · Jan 12, 2023 · Updated 3 years ago
niluthpol / weak_supervised_video_moment
Weakly Supervised Video Moment Retrieval from Text Queries
☆43 · Jul 20, 2020 · Updated 6 years ago
jamespark3922 / adv-inf
Adversarial Inference for Multi-Sentence Video Descriptions (CVPR 2019)
☆34 · Jul 17, 2019 · Updated 7 years ago
INK-USC / VisCOLL
Code and data for the project "Visually grounded continual learning of compositional semantics"
☆22 · Dec 27, 2022 · Updated 3 years ago
NJUPT-MCC / DualVGR-VideoQA
Implementation for the journal paper "DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering" (Jianyu et al., IEEE Tran…
☆18 · Jun 22, 2021 · Updated 5 years ago
UCSC-VLAA / Image-Pretraining-for-Video
[ECCV 2022] This repository includes the official implementation of our paper "In Defense of Image Pre-Training for Spatiotemporal Recogniti…
☆19 · Dec 22, 2022 · Updated 3 years ago
princetonvisualai / imagecaptioning-bias
Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"
☆12 · Mar 26, 2026 · Updated 3 months ago
thaolmk54 / LOGNet-VQA
Implementation for the paper "Dynamic Language Binding in Relational Visual Reasoning" (Le et al., IJCAI 2020)
☆13 · Jul 25, 2024 · Updated last year
tanghaoyu258 / ACRM-for-moment-retrieval
☆27 · Aug 16, 2022 · Updated 3 years ago
zhengyang5 / MMED400
☆13 · Nov 19, 2024 · Updated last year
tgc1997 / RMN
IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning
☆79 · Nov 23, 2020 · Updated 5 years ago
ikuinen / semantic_completion_network
☆26 · Aug 4, 2020 · Updated 5 years ago
interactive-cookbook / ara
Corpus and code for Aligned Recipe Actions (ARA) corpus, EMNLP 2021
☆10 · May 22, 2024 · Updated 2 years ago
mayu-ot / rethinking-evs
Scripts of our CVPR'19 paper "Rethinking the Evaluation of Video Summaries"
☆68 · Aug 24, 2021 · Updated 4 years ago
CPJKU / cca_layer
Implementation of Canonical Correlation Analysis Layer for Cross-Modality Retrieval.
☆31 · Mar 8, 2018 · Updated 8 years ago
Wangt-CN / VQG-GCN
A GCN-based visual question generation model
☆13 · Aug 21, 2019 · Updated 6 years ago
CrossmodalGroup / GSMN
Implementation of our CVPR2020 paper, Graph Structured Network for Image-Text Matching
☆170 · Oct 12, 2020 · Updated 5 years ago
StanfordVL / STGraph
Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"
☆23 · Mar 4, 2020 · Updated 6 years ago
airxiechao / symptom-text-recognition
Recognizing text that describes symptoms in medical case records
☆10 · Dec 31, 2021 · Updated 4 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 · Feb 16, 2022 · Updated 4 years ago
PKU-ICST-MIPL / MGAH_TMM2019
Source code of our TMM 2019 paper "Multi-pathway Generative Adversarial Hashing for Unsupervised Cross-modal Retrieval"
☆12 · Jun 17, 2019 · Updated 7 years ago
GingL / ARN
Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
☆33 · Aug 29, 2019 · Updated 6 years ago
rGitcy / TCM-Data-Mining_papers
☆11 · Jan 9, 2019 · Updated 7 years ago
tomekkorbak / treehopper
A Tree-LSTM-based dependency tree sentiment labeler
☆15 · May 9, 2019 · Updated 7 years ago
HouJP / my-mllib
This project implements some machine learning algorithms on Spark, written in Scala, and also includes standalone implementatio…
☆16 · Jan 3, 2022 · Updated 4 years ago
devraj89 / Generalized-Semantic-Preserving-Hashing-for-N-Label-Cross-Modal-Retrieval
This is the implementation for the paper "Generalized Semantic Preserving Hashing for N-Label Cross-Modal Retrieval"
☆14 · Dec 7, 2017 · Updated 8 years ago
ranqiu92 / NumNet
☆33 · Jul 25, 2024 · Updated last year
niluthpol / multimodal_vtt
Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
☆68 · Apr 10, 2020 · Updated 6 years ago
simon-ging / coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
☆291 · Sep 6, 2022 · Updated 3 years ago
SSahuDS / Lipreading-Using-Mutimodal-Speech-Recognition
Multimodal Speech Recognition for phoneme level prediction using Audio-Visual data from TCDTIMIT dataset implementing RNNs with LSTMs for…
☆15 · Jul 27, 2023 · Updated 2 years ago
hyounghk / VideoQADenseCapFrameGate-ACL2020
Code for ACL 2020 paper "Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA." Hyounghun Kim, Zineng T…
☆34 · May 14, 2020 · Updated 6 years ago
Alvin-Zeng / PGCN
Graph Convolutional Networks for Temporal Action Localization (ICCV2019)
☆322 · Jul 4, 2020 · Updated 6 years ago
jshi31 / NAFAE
Implementation of paper "Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Los…
☆30 · Jun 29, 2020 · Updated 6 years ago
iworldtong / Awesome-Temporal-Sentence-Grounding-in-Videos
A curated list on grounding natural language in video and related areas. :-)
☆82 · Dec 16, 2019 · Updated 6 years ago