Video-Language Alignment via Spatio–Temporal Graph Transformer; ArXiv: https://arxiv.org/abs/2407.11677
☆14Jul 24, 2024Updated last year
Alternatives and similar repositories for STGT
Users that are interested in STGT are comparing it to the libraries listed below
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)"☆25Feb 2, 2025Updated last year
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)☆66Jun 7, 2024Updated last year
- This repository contains the code associated with our 2023 TMI paper "Latent Graph Representations for Critical View of Safety Assessment…☆35Sep 17, 2025Updated 5 months ago
- A vision-language model with bidirectional progressive fusion and global-local alignment for enhanced medical image segmentation.☆17Dec 25, 2025Updated 2 months ago
- The implementation code for the paper "Multimodal Sentiment Analysis with Mutual Information-based Disentangled Representation Learning"☆18May 8, 2025Updated 9 months ago
- ☆14Aug 28, 2024Updated last year
- [ACM MM2024] The code for HMLLM.☆11Oct 27, 2024Updated last year
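- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display☆10Feb 13, 2024Updated 2 years ago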
- Hypergraph Vision Transformers: Images are More than Nodes, More than Edges☆17Jul 25, 2025Updated 7 months ago
- [IJCAI-24] Explore Internal and External Similarity for Single Image Deraining with Graph Neural Networks☆10Sep 2, 2024Updated last year
- Using machine learning techniques for predicting and modelling nonlinear dynamic systems.☆10Jun 29, 2018Updated 7 years ago
- The official PyTorch implementation of our IJCV-2025 paper "Learning with Enriched Inductive Biases for Vision-Language Models".☆14Mar 26, 2025Updated 11 months ago
- Official implementation of the paper "M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding"☆21Jan 14, 2026Updated last month
- A MATLAB package for analyzing chaotic properties of time series data☆11Jun 29, 2018Updated 7 years ago
- [ICME2025 Oral] Official PyTorch Code for "Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition"☆11Mar 21, 2025Updated 11 months ago
- Official codebase for FACMIC: Federated Adaptative CLIP Model for Medical Image Classification (Accepted at MICCAI 2024)☆14Jun 21, 2024Updated last year
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 7 months ago
- ☆14Jul 8, 2024Updated last year
- ☆12Dec 17, 2024Updated last year
- Official Implementation of the topograph method for topology-preserving image segmentation.☆22Oct 2, 2024Updated last year
- This GitHub repository contains the implementation of the method proposed in the MDGNN_BS paper☆12May 9, 2024Updated last year
- EfficientSAM + YOLO World base model for use with Autodistill.☆10Feb 21, 2024Updated 2 years ago
- ☆23Jun 12, 2025Updated 8 months ago
- Community-aware Graph Transformer (CGT) is a novel Graph Transformer model that utilizes community structures to address node degree bias…☆15Aug 27, 2025Updated 6 months ago
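- Project description: this project implements an integrated entertainment system on the GEC6818 development board, including PopStar, an electronic piano, the 2048 game, MP4 playback, and other features. It is split into a game client and a game server. The game client stores the game board in a vector container and uses the data stored in the board to draw the BMP image for each number onto the GEC681…☆10Feb 13, 2022Updated 4 years ago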
- A project using YoloV8 to detect License Plates☆12Sep 29, 2023Updated 2 years ago
- ☆12Oct 30, 2024Updated last year
- EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE☆10Mar 1, 2024Updated 2 years ago
- The code for the paper "O-Mamba: O-shape State-Space Model for Underwater Image Enhancement"☆13Oct 18, 2024Updated last year
- Multiple Attractors simulation with customization☆14Feb 22, 2026Updated last week
- ☆13Jul 6, 2024Updated last year
- A multimodal model bridging vision and genomics for biodiversity monitoring at scale.☆16Sep 18, 2025Updated 5 months ago
- ☆12Oct 30, 2025Updated 4 months ago
- [MICCAI 2025] FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation.☆21Sep 24, 2025Updated 5 months ago
- Official codebase for the WACV 2023 paper: Scaling Novel Object Detection with Weakly Supervised Detection Transformers. https://arxiv.or…☆13Mar 18, 2024Updated last year
- Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation☆15Sep 24, 2025Updated 5 months ago
- Graph in Graph Neural Network (https://arxiv.org/abs/2407.00696)☆15Sep 12, 2024Updated last year
- The official PyTorch implementation for 2024-ICASSP-Adaptive Spatial-Temporal Hypergraph Fusion Learning for Next POI Recommendation☆12Sep 8, 2024Updated last year
- [MICCAI'24] Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring☆12Aug 2, 2024Updated last year