Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)
☆34Aug 12, 2024Updated last year
Alternatives and similar repositories for hierarcaps
Users that are interested in hierarcaps are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025☆31Apr 8, 2025Updated last year
- ☆57Aug 16, 2025Updated 10 months ago
- This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…☆21Jul 5, 2024Updated last year
- Code for the paper "Hyperbolic Image-Text Representations", Desai et al, ICML 2023☆204Aug 23, 2023Updated 2 years ago
- ☆17Nov 29, 2024Updated last year
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆22Oct 8, 2024Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆33Mar 26, 2025Updated last year
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation☆30Feb 4, 2024Updated 2 years ago
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models☆40Nov 10, 2024Updated last year
- ☆21Aug 20, 2024Updated last year
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆29Sep 27, 2024Updated last year
- ☆11Oct 2, 2024Updated last year
- ☆69Jun 2, 2026Updated last week
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"☆74Dec 8, 2025Updated 6 months ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)☆63May 26, 2024Updated 2 years ago
- ☆18Mar 19, 2026Updated 2 months ago
- Interpreting and Analyzing CLIP's Zero-Shot Image Classification via Mutual Knowledge, NeurIPS 2024☆18Jun 27, 2025Updated 11 months ago
- [NeurIPS 2024] Official PyTorch implementation of LoTLIP: Improving Language-Image Pre-training for Long Text Understanding☆50Jan 14, 2025Updated last year
- Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)☆188Jul 5, 2024Updated last year
- ☆46Dec 16, 2025Updated 6 months ago
- [NeurIPS 2025] Scaling Language-centric Omnimodal Representation Learning☆43Apr 13, 2026Updated 2 months ago
- ☆14May 23, 2022Updated 4 years ago
- Open-Vocabulary Video Question Answering: A New Benchmark for Evaluating the Generalizability of Video Question Answering Models (ICCV 20…☆18Apr 23, 2024Updated 2 years ago
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- [NeurIPS24] VisMin: Visual Minimal-Change Understanding☆19Mar 3, 2025Updated last year
- [RAL 2024] OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding☆32Feb 17, 2025Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆48Dec 1, 2024Updated last year
- ☆44Mar 8, 2021Updated 5 years ago
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"☆14Dec 1, 2024Updated last year
- Developer project for getting basic API integrations working in under 5 minutes☆11May 22, 2026Updated 3 weeks ago
- [NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models☆159Dec 9, 2024Updated last year
- LLMBind: A Unified Modality-Task Integration Framework☆19Jun 16, 2024Updated last year
- ☆15Feb 24, 2023Updated 3 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- repository for "Exploiting Proximity-Aware Tasks for Embodied Social Navigation" paper code☆12Nov 16, 2023Updated 2 years ago
- ☆53Dec 13, 2024Updated last year
- Implementation of "Conditional Score Guidance for Text-Driven Image-to-Image Translation" (NeurIPS 2023).☆11Jul 19, 2023Updated 2 years ago
- Official Implementation of ISR-DPO:Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)☆23Nov 25, 2025Updated 6 months ago
- [AAAI 2025] Official data and code for "TB-HSU: Hierarchical 3D Scene Understanding with Contextual Affordances"☆15Sep 11, 2025Updated 9 months ago
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?☆35Apr 27, 2023Updated 3 years ago
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)☆55Sep 7, 2023Updated 2 years ago