[BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment"
☆54Oct 20, 2022Updated 3 years ago
Alternatives and similar repositories for ViCHA
Users that are interested in ViCHA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [TIP2023] The code of “Plug-and-Play Regulators for Image-Text Matching”☆34Apr 11, 2024Updated 2 years ago
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Jul 6, 2024Updated last year
- Context-Aware Multi-View Summarization Network for Image-Text Matching. ACM MM'20☆29May 26, 2022Updated 3 years ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Oct 27, 2023Updated 2 years ago
- Pytorch-based tools for constructing a vocabulary of visual concepts in a GAN.☆17Feb 25, 2022Updated 4 years ago
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- ☆28Aug 1, 2024Updated last year
- Weakly Supervised Grounding for VQA in Vision-Language Transformers☆16May 6, 2023Updated 2 years ago
- ☆91Apr 15, 2022Updated 3 years ago
- TTRV: Test-Time Reinforcement Learning for Vision–Language Models (CVPR 2026)☆37Mar 8, 2026Updated last month
- ☆37Oct 7, 2023Updated 2 years ago
- [WACV 2024] Instruct Me More! Random Prompting for Visual In-Context Learning☆17May 7, 2025Updated 11 months ago
- GAN(TK)²: GAN Neural Tangent Kernel ToolKit☆13Jul 12, 2022Updated 3 years ago
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment☆10Mar 1, 2025Updated last year
- Implementation of our IJCAI2022 oral paper, ER-SAN: Enhanced-Adaptive Relation Self-Attention Network for Image Captioning.☆24Aug 5, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- [ICLR 23] Contrastive Aligned of Vision to Language Through Parameter-Efficient Transfer Learning☆40Jul 29, 2023Updated 2 years ago
- BISON: Binary Image SelectiON☆49Sep 15, 2021Updated 4 years ago
- The efficient tuning method for VLMs☆82Mar 10, 2024Updated 2 years ago
- ☆73Jun 3, 2022Updated 3 years ago
- PyTorch implementation of Data2Vec self-supervised approach for vision use cases.☆18Oct 7, 2022Updated 3 years ago
- Official Code of ECCV 2022 paper MS-CLIP☆91Jul 27, 2022Updated 3 years ago
- ☆45Aug 14, 2023Updated 2 years ago
- PyTorch implementation of HANet: Hierarchical Alignment Networks for Video-Text Retrieval (ACM MM 2021).☆47Aug 19, 2021Updated 4 years ago
- Release of ImageNet-Captions☆51Jan 20, 2023Updated 3 years ago
- NordVPN Threat Protection Pro™ • AdTake your cybersecurity to the next level. Block phishing, malware, trackers, and ads. Lightweight app that works with all browsers.
- Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"☆259May 3, 2024Updated last year
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆46Apr 1, 2026Updated last week
- ☆13Jun 3, 2024Updated last year
- [SIGIR 2022] CenterCLIP: Token Clustering for Efficient Text-Video Retrieval.☆134May 4, 2022Updated 3 years ago
- Siamese network for unsupervised speech representation learning☆11Oct 12, 2018Updated 7 years ago
- [ICLR2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context …☆22Mar 1, 2024Updated 2 years ago
- An official pytorch implementation of the paper: [MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval].☆14Jul 27, 2024Updated last year
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- ☆10Jan 9, 2025Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022)☆85Nov 2, 2022Updated 3 years ago
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval, TIP 2024☆33Jun 18, 2025Updated 9 months ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 3 years ago
- Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)☆20Dec 4, 2021Updated 4 years ago
- The source code of the paper: "To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression"☆30Jan 8, 2019Updated 7 years ago
- Learning to Initialize Neural Networks for Stable and Efficient Training☆138May 24, 2022Updated 3 years ago
- Channel Equilibrium Networks for Learning Deep Representation, ICML2020☆22Jul 28, 2020Updated 5 years ago