xmu-xiaoma666 / LSTNet
Towards Local Visual Modeling for Image Captioning
☆29Mar 31, 2023Updated 2 years ago
Alternatives and similar repositories for LSTNet
Users who are interested in LSTNet are comparing it to the libraries listed below.
- Official Code for "Knowing what it is: Semantic-enhanced Dual Attention Transformer" (TMM2022)☆19Oct 15, 2022Updated 3 years ago
- ☆13Jun 2, 2023Updated 2 years ago
- Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)☆123Dec 17, 2022Updated 3 years ago
- [IJCAI 2022] Official PyTorch code for paper “S2 Transformer for Image Captioning”☆87Aug 14, 2024Updated last year
- Official PyTorch implementation of `[ACMMM 2023] Relational Contrastive Learning for Scene Text Recognition`☆17Sep 22, 2023Updated 2 years ago
- TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers☆21Jul 26, 2022Updated 3 years ago
- [MM2023] An official implementation of the paper "One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer"☆16Nov 3, 2023Updated 2 years ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆198May 9, 2023Updated 2 years ago
- Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition (ICDAR 2023)☆15Aug 29, 2023Updated 2 years ago
- This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135☆19Mar 7, 2023Updated 2 years ago
- Image captioning with weight pruning in PyTorch☆22Jan 14, 2022Updated 4 years ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.☆19Jun 7, 2024Updated last year
- [CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning".☆21Nov 28, 2022Updated 3 years ago
- Official PyTorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).☆202Jun 8, 2022Updated 3 years ago
- Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning"☆94Dec 25, 2024Updated last year
- Official repository of the paper: "A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition"☆26Jul 10, 2023Updated 2 years ago
- A paper list of image captioning.☆22Apr 23, 2022Updated 3 years ago
- Progressive Transformer-Based Generation of Radiology Reports☆25Jan 5, 2025Updated last year
- PyTorch implementation of BMVC2022 paper Masked Vision-Language Transformers for Scene Text Recognition☆29Nov 11, 2022Updated 3 years ago
- Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition☆28Aug 29, 2023Updated 2 years ago
- ☆26Feb 2, 2023Updated 3 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- Implementation of the Object Relation Transformer for Image Captioning☆180Sep 17, 2024Updated last year
- Code for the paper "Pushing the Performance Limit of Scene Text Recognizer without Human Annotation" (CVPR 2022).☆28Jul 6, 2022Updated 3 years ago
- Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]☆69Jun 1, 2024Updated last year
- MixGen: A New Multi-Modal Data Augmentation☆126Jan 9, 2023Updated 3 years ago
- The implementation of multi-branch attentive Transformer (MAT).☆33Aug 27, 2020Updated 5 years ago
- ☆38Feb 4, 2023Updated 3 years ago
- A TensorFlow implementation of NRTR, a No-Recurrence Seq2Seq Model for Scene Text Recognition☆31Sep 1, 2019Updated 6 years ago
- ☆42Sep 2, 2023Updated 2 years ago
- Implementation of paper "Improving Image Captioning with Better Use of Caption"☆33Sep 15, 2020Updated 5 years ago
- Data Programming for Text Detection in Documents using SPEAR☆12Mar 26, 2025Updated 10 months ago
- [CVPR 2025] Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding☆15Jun 16, 2025Updated 7 months ago
- Two technical workshops from GenAI Hackathon Cairo 2025: build LLM apps with RAG & tools, and create a coding AI agent from scratch (Pyt…☆34Jan 20, 2026Updated 3 weeks ago
- MCFL for Pedestrian attribute recognition☆13Jul 20, 2020Updated 5 years ago
- Implementation of various handwritten text line segmentation☆10Jan 6, 2020Updated 6 years ago
- ☆14Jan 9, 2025Updated last year
- Source code related to the research paper entitled RVENet: A Large Echocardiographic Dataset for the Deep Learning-Based Assessment of Ri…☆12Mar 10, 2024Updated last year
- Developer's public open source resource contribution repository.☆10Nov 6, 2022Updated 3 years ago