ChenAnno / SPIRIT_TOMM2024
Official implementation for "SPIRIT: Style-guided Patch Interaction for Fashion Image Retrieval with Text Feedback"
☆17Updated last year
Alternatives and similar repositories for SPIRIT_TOMM2024
Users who are interested in SPIRIT_TOMM2024 are comparing it to the repositories listed below
- Official implementation for "FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval"☆19Updated last year
- Official implementation for "Real20M: A Large-scale E-commerce Dataset for Cross-domain Retrieval"☆27Updated last year
- ☆71Updated 5 months ago
- [TIP 2022] Official code of paper “Video Question Answering with Prior Knowledge and Object-sensitive Learning”☆46Updated last year
- [IJCAI 2022] Official Pytorch code for paper “S2 Transformer for Image Captioning”☆84Updated last year
- Paper list on LLMs and multimodal LLMs☆42Updated 2 weeks ago
- Collection of Composed Image Retrieval (CIR) papers.☆253Updated last week
- [SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval☆41Updated last year
- A Framework of Continual Learning☆117Updated last week
- Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models☆33Updated last week
- ☆136Updated 6 months ago
- ☆94Updated 2 years ago
- Repository for an end-to-end image captioning method PTSN(ACM MM22).☆60Updated 2 years ago
- ☆49Updated 9 months ago
- [NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning☆40Updated 9 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"☆95Updated 8 months ago
- Codes of the Fine-grained Textual Inversion network for Zero-Shot Composed Image Retrieval☆25Updated 4 months ago
- A summary of zero-shot image recognition methods, from the perspective of element-wise representation and reasoning, covering public…☆19Updated 10 months ago
- [CVPR25 Highlight] A ChatGPT-Prompted Visual hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced eval…☆20Updated 4 months ago
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations☆139Updated last year
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin…☆75Updated last year
- [CVPR 2024] How to Configure Good In-Context Sequence for Visual Question Answering☆19Updated 2 months ago
- [ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers☆16Updated last year
- XCurve is an end-to-end PyTorch library for X-Curve metrics optimizations in machine learning.☆142Updated last year
- CPL: Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning☆64Updated last year
- Papers about Hallucination in Multi-Modal Large Language Models (MLLMs)☆94Updated 9 months ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)☆168Updated 3 weeks ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"☆218Updated last year
- Official github repo for ICCV2023 paper 'Multi-event Video-Text Retrieval'☆18Updated last year
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"☆55Updated 11 months ago