☆14 · Dec 31, 2024 · Updated last year
Alternatives and similar repositories for VIPCAP
Users who are interested in VIPCAP are comparing it to the libraries listed below.
- [EMNLP 2024] IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning ☆14 · May 13, 2025 · Updated last year
- ☆10 · Jul 5, 2024 · Updated last year
- SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation ☆125 · Feb 13, 2024 · Updated 2 years ago
- ☆12 · May 3, 2024 · Updated 2 years ago
- Nearest Neighbor Normalization (EMNLP 2024) ☆21 · Nov 1, 2024 · Updated last year
- Cross-modal Active Complementary Learning with Self-refining Correspondence (NeurIPS 2023, PyTorch Code) ☆15 · Jun 6, 2024 · Updated 2 years ago
- [ACL Main 2025] I0T: Embedding Standardization Method Towards Zero Modality Gap ☆12 · Jun 18, 2025 · Updated last year
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension ☆63 · Apr 8, 2024 · Updated 2 years ago
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval" ☆55 · Mar 28, 2024 · Updated 2 years ago
- Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.' ☆27 · Dec 3, 2023 · Updated 2 years ago
- Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature. ☆40 · Feb 13, 2025 · Updated last year
- ☆20 · Feb 20, 2025 · Updated last year
- THE ART of MULTIPROCESSOR PROGRAMMING, Maurice Herlihy & Nir Shavit ☆11 · Feb 12, 2023 · Updated 3 years ago
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning" ☆33 · Mar 26, 2025 · Updated last year
- ☆22 · Apr 27, 2024 · Updated 2 years ago
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024 ☆33 · Jun 18, 2025 · Updated last year
- Project to provide driver guidance through object recognition in the vehicle driving environment: Display bounding boxes on objects in im… ☆20 · Aug 25, 2024 · Updated last year
- ☆42 · Mar 28, 2024 · Updated 2 years ago
- [ICLR 2025] TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval ☆27 · Feb 13, 2025 · Updated last year
- Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?" ☆33 · Mar 15, 2024 · Updated 2 years ago
- An official PyTorch implementation of the paper: [MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval]. ☆14 · Jul 27, 2024 · Updated last year
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer' ☆10 · Jan 23, 2024 · Updated 2 years ago
- ☆10 · Apr 7, 2024 · Updated 2 years ago
- [CVPR2025] Official code repository for SeTa: "Scale Efficient Training for Large Datasets" ☆24 · Mar 18, 2025 · Updated last year
- ☆31 · Jun 10, 2026 · Updated 2 weeks ago
- ☆14 · Oct 14, 2019 · Updated 6 years ago
- ☆13 · Feb 13, 2025 · Updated last year
- ☆14 · Jul 13, 2024 · Updated last year
- Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation ☆48 · Mar 3, 2025 · Updated last year
- Automatic Generation of Scaffolding Questions for Learning Math, EMNLP 2022. RL, REINFORCE ☆25 · Jun 30, 2023 · Updated 3 years ago
- ☆12 · Apr 19, 2024 · Updated 2 years ago
- Code repository for "Post-pre-training for Modality Alignment in Vision-Language Foundation Models" (CVPR2025) ☆41 · Jul 25, 2025 · Updated 11 months ago
- ☆19 · Mar 5, 2024 · Updated 2 years ago
- [2024] INSANet: INtra-INter Spectral Attention Network for Effective Feature Fusion of Multispectral Pedestrian Detection, Sensors. ☆23 · Mar 20, 2024 · Updated 2 years ago
- [NeurIPS '24] Frustratingly easy Test-Time Adaptation of VLMs!! ☆64 · Mar 24, 2025 · Updated last year
- (CVPR2024) MeaCap: Memory-Augmented Zero-shot Image Captioning ☆56 · Aug 16, 2024 · Updated last year
- SotA text-only image/video method (IJCAI 2023) ☆15 · Jan 9, 2024 · Updated 2 years ago
- [ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data ☆45 · Oct 15, 2023 · Updated 2 years ago
- ☆28 · Jul 9, 2025 · Updated 11 months ago