[TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning".
☆10Aug 14, 2024Updated last year
Alternatives and similar repositories for SPT
Users that are interested in SPT are comparing it to the libraries listed below
- ☆11Jul 11, 2023Updated 2 years ago
- [TIP 2022] Official code of paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”☆46Jan 27, 2024Updated 2 years ago
- [IJCAI 2022] Official Pytorch code for paper “S2 Transformer for Image Captioning”☆87Aug 14, 2024Updated last year
- (ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"☆20May 15, 2025Updated 9 months ago
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.☆26Aug 2, 2024Updated last year
- Official implementation of the CVPR '25 highlight paper "Compositional Caching for Training-free Open-vocabulary Attribute Detection"☆23Dec 23, 2024Updated last year
- ☆20Jul 22, 2024Updated last year
- Local self-attention in Transformer for visual question answering☆13Mar 17, 2024Updated last year
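- Scrapes hotel reviews from the Ctrip website, then performs data preprocessing and sentiment-classification-based data analysis, using processing techniques such as jieba review tokenization; analysis and prediction techniques such as sentiment lexicons, feature extraction, and machine learning models; and visualization techniques such as word clouds and heat maps☆13Jul 15, 2022Updated 3 years ago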
- [ICLR 2026] Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization☆18Feb 14, 2026Updated 2 weeks ago
- The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning☆13Apr 14, 2024Updated last year
- [NeurIPS 2025] An official source code for paper "L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models"☆23Oct 29, 2025Updated 4 months ago
- A naïve Bayesian spam filter in Python☆10Dec 18, 2019Updated 6 years ago
- ☆13Nov 28, 2021Updated 4 years ago
- The Pytorch implementation of "FeatWalk: Enhancing Few-Shot Classification through Local View Leveraging", AAAI 2024.☆11Mar 4, 2024Updated 2 years ago
- Regularly Truncated M-estimators for Learning with Noisy Labels☆11Apr 24, 2024Updated last year
- [ICCV 2025] Boosting Multi-View Indoor 3D Object Detection via Adaptive 3D Volume Construction☆23Oct 1, 2025Updated 5 months ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.☆10May 16, 2024Updated last year
- GPT Demo with hybrid distributed training☆10Dec 1, 2022Updated 3 years ago
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023☆11Oct 5, 2023Updated 2 years ago
- Official PyTorch implementation of our CVPR 2025 paper, "LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning."☆16Mar 28, 2025Updated 11 months ago
- ☆14Jan 5, 2022Updated 4 years ago
- [ICCV 2025] Repository for A Quality-Guided Mixture of Score-fusion Experts Framework for Human Recognition☆16Sep 29, 2025Updated 5 months ago
- ☆24Oct 9, 2025Updated 4 months ago
- Embodied Instruction Following in Unknown Environments☆17Dec 8, 2025Updated 2 months ago
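- An introductory SpringCloud microservices tutorial, with basic integration demos of Eureka service registration and discovery, Config configuration center, BUS message bus, FeignClient client, Zuul gateway, Hystrix circuit breaking and fallback, Stream message queues, Sleuth request tracing, and Swagger documentation.☆11Aug 26, 2024Updated last year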
- ☆12Jan 17, 2024Updated 2 years ago
- Official implementation of "What does CLIP know about a red circle? Visual Prompt Engineering for VLMs", ICCV 2023☆11Sep 21, 2023Updated 2 years ago
- Rich Visual Knowledge-based Augmentation Network for Visual Question Answering☆10Dec 6, 2019Updated 6 years ago
- Official PyTorch implementation for the paper "Interpretable Image Classification via Non-parametric Part Prototype Learning" CVPR 2025.☆26Jun 10, 2025Updated 8 months ago
- [AAAI25] Implementation of paper "WiFi Temporal Activity Detection via Dual Pyramid Network"☆14Aug 26, 2025Updated 6 months ago
- ☆12Mar 30, 2023Updated 2 years ago
- ☆14Apr 1, 2023Updated 2 years ago
- [arXiv 2023] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"☆15Aug 30, 2023Updated 2 years ago
- 🌠 A simple distributed computing system built with PySimpleGUI: a simple multi-machine collaborative computing prototype system (Simply Multi-Machine Collaborative Computing)☆11May 26, 2020Updated 5 years ago
- [AAAI2023] Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition (SloshNet)☆13Jan 10, 2024Updated 2 years ago
- ☆10Aug 21, 2022Updated 3 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code for the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 2 years ago
- [ICCV'23] PAINet: Parallel Attention Interaction Network for Few-shot Skeleton-based Action Recognition☆11Oct 14, 2023Updated 2 years ago