LeeYN-43/Clover



Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR 2023)

☆39

Alternatives and similar repositories for Clover

Users who are interested in Clover are comparing it to the libraries listed below.


luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
yonatanbitton / data_efficient_masked_language_modeling_for_vision_and_language
Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".
☆18 · Sep 17, 2021 · Updated 4 years ago
luo3300612 / Semantics-AssistedVideoCaptioning.pytorch
PyTorch implementation of Semantics-AssistedVideoCaptioning
☆11 · Feb 16, 2023 · Updated 3 years ago
luogen1996 / LWTransformer
Lightweight Transformer for Multi-modal Tasks
☆16 · Dec 9, 2022 · Updated 3 years ago
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆62 · Jun 17, 2023 · Updated 3 years ago
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Updated this week
salesforce / ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
☆188 · May 1, 2025 · Updated last year
nini0919 / SemiRES
[ICML 2024] The official implementation of SemiRES in PyTorch.
☆33 · Jun 20, 2024 · Updated 2 years ago
xmu-xiaoma666 / ImageCaptionMetrics
This repository contains 2 tools: - A py3 Lib for NLP & image-caption metrics - Code for a two-tailed t-test with paired samples. It wil…
☆18 · Apr 4, 2021 · Updated 5 years ago
zhangxuying1004 / RSTNet
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
☆123 · Dec 17, 2022 · Updated 3 years ago
rentainhe / TRAR-VQA
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
☆68 · Oct 11, 2021 · Updated 4 years ago
gicheonkang / sglkt-visdial
🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"
☆13 · Feb 1, 2023 · Updated 3 years ago
nini0919 / DiffPNG
[ECCV 2024] The official implementation of the DiffPNG paper in PyTorch.
☆17 · Oct 17, 2024 · Updated last year
TencentYoutuResearch / PersonRetrieval-IVT
Code for ECCV 2022 Workshop paper "See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval"
☆23 · Nov 16, 2025 · Updated 8 months ago
seanzhuh / SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
☆144 · Oct 30, 2024 · Updated last year
lustrouselixir / SANet
Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising
☆15 · Mar 18, 2023 · Updated 3 years ago
YouHuang67 / mamba-code-explained
☆19 · Jan 7, 2026 · Updated 6 months ago
mrwu-mac / DIFNet
[CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning".
☆21 · Nov 28, 2022 · Updated 3 years ago
linjieli222 / HERO
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
☆235 · Sep 16, 2021 · Updated 4 years ago
CharlieDDDD / AISurveyPapers
Large Visual Language Model (LVLM), Large Language Model (LLM), Multimodal Large Language Model (MLLM), Alignment, Agent, AI System, Survey
☆21 · Jul 27, 2025 · Updated 11 months ago
gicheonkang / gst-visdial
Official PyTorch Implementation for CVPR'23 Paper, "The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training"
☆20 · Dec 11, 2023 · Updated 2 years ago
wenz116 / DRFT
End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021
☆18 · Oct 24, 2021 · Updated 4 years ago
luo3300612 / image-captioning-DLCT
Official pytorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
☆203 · Jun 8, 2022 · Updated 4 years ago
Mi-Peng / Sparse-Sharpness-Aware-Minimization
[NeurIPS 2022] Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach -- Official Implementation
☆48 · Jun 29, 2023 · Updated 3 years ago
ypwang61 / StoryEval
[CVPR2025] Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
☆21 · May 2, 2025 · Updated last year
yuxiaodongHRI / SOIT
SOIT: Segmenting Objects with Instance-Aware Transformers
☆14 · Jun 6, 2022 · Updated 4 years ago
PKU-ICST-MIPL / MAI_ICLR2025
☆20 · Mar 5, 2025 · Updated last year
showlab / DemoVLP
[Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training
☆22 · Mar 19, 2022 · Updated 4 years ago
showlab / all-in-one
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
☆281 · Mar 25, 2023 · Updated 3 years ago
weixi-feng / TC-Bench
☆27 · Jun 22, 2024 · Updated 2 years ago
foolwood / DRL
[arXiv22] Disentangled Representation Learning for Text-Video Retrieval
☆96 · Apr 7, 2022 · Updated 4 years ago
rentainhe / pytorch-distributed-training
Simple tutorials on PyTorch DDP training
☆279 · Aug 19, 2022 · Updated 3 years ago
phelps-matthew / FeatherMap
Implementation of "Structured Multi-Hashing for Model Compression" (CVPR 2020)
☆12 · Feb 18, 2021 · Updated 5 years ago
sauradip / TAGS
[ECCV 2022] Official PyTorch implementation of the paper "Proposal-Free Temporal Action Detection with Global Segmentation Mask Learning"…
☆18 · Oct 19, 2022 · Updated 3 years ago
xuewyang / Fashion_Captioning
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.
☆85 · Jun 22, 2023 · Updated 3 years ago
BCV-Uniandes / PNG
☆61 · Oct 23, 2021 · Updated 4 years ago
jpthu17 / HBI
[CVPR 2023 Highlight & TPAMI] Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
☆125 · Dec 28, 2024 · Updated last year
zmykevin / UVLP
CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆21 · Apr 15, 2022 · Updated 4 years ago
xmu-xiaoma666 / X-Mesh
A PyTorch implementation of "X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance"
☆29 · Jan 12, 2024 · Updated 2 years ago