xmu-xiaoma666/LSTNet

xmu-xiaoma666 / LSTNet

Towards Local Visual Modeling for Image Captioning

☆30

Alternatives and similar repositories for LSTNet

Users who are interested in LSTNet are comparing it to the libraries listed below.

weimingboya / DFT
☆13 · Jun 2, 2023 · Updated 3 years ago
xmu-xiaoma666 / SDATR
Official Code for "Knowing what it is: Semantic-enhanced Dual Attention Transformer" (TMM2022)
☆19 · Oct 15, 2022 · Updated 3 years ago
zhangxuying1004 / RSTNet
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
☆123 · Dec 17, 2022 · Updated 3 years ago
luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
zchoi / S2-Transformer
[IJCAI 2022] Official Pytorch code for paper “S2 Transformer for Image Captioning”
☆86 · Aug 14, 2024 · Updated last year
aimagelab / PMA-Net
[ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.
☆19 · Jun 7, 2024 · Updated 2 years ago
mrwu-mac / DIFNet
[CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning".
☆21 · Nov 28, 2022 · Updated 3 years ago
GT-RIPL / Xmodal-Ctx
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …
☆61 · Oct 21, 2022 · Updated 3 years ago
luo3300612 / image-captioning-DLCT
Official pytorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
☆203 · Jun 8, 2022 · Updated 4 years ago
buxiangzhiren / DDCap
☆85 · Dec 4, 2022 · Updated 3 years ago
jchenghu / ExpansionNet_v2
Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning"
☆96 · Dec 25, 2024 · Updated last year
michelecafagna26 / vinvl-visualbackbone
Original VinVL visual backbone with simplified APIs to easily extract features, boxes, object detections, in a few lines of Python code.
☆12 · Nov 27, 2022 · Updated 3 years ago
omar-mohamed / Chest-X-Ray-Tags-Classification
This is the implementation of the visual model mentioned in our paper 'Automated Radiology Report Generation using Conditioned Transforme…
☆10 · Jul 25, 2024 · Updated 2 years ago
jiahuei / sparse-image-captioning
Image captioning with weight pruning in PyTorch
☆22 · Jan 14, 2022 · Updated 4 years ago
uzh-dqbm-cmi / ARGON
Progressive Transformer-Based Generation of Radiology Reports
☆25 · Jan 5, 2025 · Updated last year
XuMengyaAmy / ReportDALS
☆16 · Nov 19, 2020 · Updated 5 years ago
aimagelab / meshed-memory-transformer
Meshed-Memory Transformer for Image Captioning. CVPR 2020
☆546 · Dec 21, 2022 · Updated 3 years ago
ezeli / Transformer_model
A pytorch implementation of Attention Is All You Need (Transformer) for image captioning.
☆12 · Nov 15, 2021 · Updated 4 years ago
smileslabsh / Generative-Label-Fused-Network
Generative label fused network for image–text matching
☆10 · Jan 13, 2023 · Updated 3 years ago
ZhangXu0963 / VSL
The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.
☆15 · Dec 25, 2023 · Updated 2 years ago
jacobswan1 / ViTCAP
Implementation for CVPR 2022 paper "Injecting Semantic Concepts into End-to-End Image Captioning".
☆43 · May 28, 2022 · Updated 4 years ago
YuanEZhou / CBTrans
☆24 · Apr 4, 2022 · Updated 4 years ago
232525 / PureT
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
☆70 · Jun 1, 2024 · Updated 2 years ago
SjokerLily / awesome-image-captioning
A paper list of image captioning.
☆21 · Apr 23, 2022 · Updated 4 years ago
HA-Transformer / MAT
The implementation of multi-branch attentive Transformer (MAT).
☆33 · Aug 27, 2020 · Updated 5 years ago
MIS-DevWorks / FBR
This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…
☆11 · Oct 9, 2024 · Updated last year
LiDlab / TransDSI
A deep learning framework for deubiquitinase-substrate interaction identification
☆13 · Jul 31, 2025 · Updated 11 months ago
fawazsammani / show-edit-tell
Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020
☆82 · Jul 17, 2020 · Updated 6 years ago
aimagelab / camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
☆30 · Dec 1, 2022 · Updated 3 years ago
eyob12 / Deep_infrared_and_visible_image_fusion
(Neurocomputing) A Deep Learning and Image Enhancement Based Pipeline for Infrared and Visible Image Fusion
☆19 · Mar 14, 2024 · Updated 2 years ago
Arun-George-Zachariah / awesome-video-retrieval-papers
List of resources for video retrieval.
☆20 · Mar 17, 2022 · Updated 4 years ago
Control-xl / Medical-Vision-Langauge-Transformer
☆17 · Nov 1, 2023 · Updated 2 years ago
tany0699 / FMViT
☆31 · Aug 3, 2023 · Updated 2 years ago
kw717 / SGFNet
☆17 · Feb 20, 2024 · Updated 2 years ago
MinjieWan / WTAPNet
☆12 · Nov 11, 2024 · Updated last year
alexhe101 / WINet
Official implementation of "Pan-Sharpening With Wavelet-Enhanced High-Frequency Information"
☆12 · Mar 28, 2024 · Updated 2 years ago
matanr / capex
CAPE using text-graphs
☆29 · Apr 7, 2025 · Updated last year
AWCXV / FusionBooster
(IJCV 2025) This is the official implementation for the paper titled "FusionBooster: A Unified Image Fusion Boosting Paradigm".
☆15 · Jul 23, 2025 · Updated last year
DCGM / SoftCTC
This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135
☆19 · Mar 7, 2023 · Updated 3 years ago