[TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".
☆10Aug 14, 2024Updated last year
Alternatives and similar repositories for SPT
Users that are interested in SPT are comparing it to the libraries listed below.
- ☆11Jul 11, 2023Updated 2 years ago
- [TIP 2022] Official code of paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”☆46Jan 27, 2024Updated 2 years ago
- [IJCAI 2022] Official Pytorch code for paper “S2 Transformer for Image Captioning”☆87Aug 14, 2024Updated last year
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆20May 15, 2025Updated 10 months ago
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.☆27Aug 2, 2024Updated last year
- Official implementation of the CVPR '25 highlight paper "Compositional Caching for Training-free Open-vocabulary Attribute Detection"☆23Dec 23, 2024Updated last year
- ☆20Jul 22, 2024Updated last year
- 🌠A simple distributed computing system implemented with PySimpleGUI: a simple multi-machine collaborative computing prototype system (Simply Multi-Machine Collaborative Computing)☆11May 26, 2020Updated 5 years ago
- AVX512 population count routines☆23Aug 2, 2019Updated 6 years ago
- Rich Visual Knowledge-based Augmentation Network for Visual Question Answering☆10Dec 6, 2019Updated 6 years ago
- Local self-attention in Transformer for visual question answering☆13Mar 17, 2024Updated 2 years ago
- Official implementation of "What does CLIP know about a red circle? Visual Prompt Engineering for VLMs", ICCV 2023☆11Sep 21, 2023Updated 2 years ago
- A survey on MM-LLMs for long video understanding: From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long…☆18Sep 12, 2025Updated 6 months ago
- ☆14Jan 5, 2022Updated 4 years ago
- Extension of hLSTMat☆19Apr 15, 2021Updated 4 years ago
- This repository contains code for the paper 'Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation'.☆18Aug 6, 2022Updated 3 years ago
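- Crawls hotel reviews from the Ctrip website, then performs data preprocessing and sentiment-classification-based data analysis, using processing techniques such as jieba word segmentation of reviews; analysis and prediction techniques such as sentiment lexicons, feature extraction, and machine learning models; and visualization techniques such as word clouds and heat maps☆13Jul 15, 2022Updated 3 years ago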
- ☆19Dec 16, 2020Updated 5 years ago
- GPT Demo with hybrid distributed training☆10Dec 1, 2022Updated 3 years ago
- Repository for an end-to-end image captioning method PTSN(ACM MM22).☆60Dec 11, 2022Updated 3 years ago
- ☆31Dec 6, 2025Updated 3 months ago
- ☆21Mar 1, 2022Updated 4 years ago
- A naïve Bayesian spam filter in Python☆10Dec 18, 2019Updated 6 years ago
- ☆30Dec 14, 2025Updated 3 months ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.☆10May 16, 2024Updated last year
- [TIP25] Code for "Text-Video Retrieval with Global-Local Semantic Consistent Learning"☆14May 12, 2025Updated 10 months ago
- Official pytorch implementation of CVPR2023 paper "Learning Conditional Attributes for Compositional Zero-Shot Learning"☆18Oct 19, 2025Updated 5 months ago
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023☆11Oct 5, 2023Updated 2 years ago
- The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning☆13Apr 14, 2024Updated last year
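- An introductory SpringCloud microservices tutorial, with basic integration demos of Eureka service registration and discovery, Config configuration center, BUS message bus, FeignClient client, Zuul gateway, Hystrix circuit breaking and fallback, Stream message queue, Sleuth trace monitoring, and Swagger documentation.☆11Aug 26, 2024Updated last year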
- ☆25Oct 9, 2025Updated 5 months ago
- [ Arxiv 2023 ] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"☆15Aug 30, 2023Updated 2 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 2 years ago
- ☆13Nov 28, 2021Updated 4 years ago
- Embodied Instruction Following in Unknown Environments☆17Dec 8, 2025Updated 3 months ago
- Talk to ChatGPT and Generate image via any Matrix client!☆16Apr 25, 2023Updated 2 years ago
- Submission Guide + Discussion Board for AI Singapore Global Challenge for Safe and Secure LLMs (Track 1A).☆16Jul 4, 2024Updated last year
- [NeurIPS 2025] An official source code for paper "L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models"☆24Oct 29, 2025Updated 4 months ago
- [CVPR 2023] Official Pytorch code for paper "Prototype-based Embedding Network for Scene Graph Generation"☆59Jun 8, 2023Updated 2 years ago