Official PyTorch codebase for the Modeling Caption Diversity in Contrastive Vision-Language Pretraining paper.
☆18Mar 28, 2025Updated last year
Alternatives and similar repositories for Llip
Users that are interested in Llip are comparing it to the libraries listed below.
- Code accompanying paper "SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation"☆25Apr 8, 2026Updated last week
- EgoToM is an egocentric theory-of-mind benchmark built on Ego4D videos, containing multi-choice questions that evaluate multimodal large …☆14Apr 1, 2025Updated last year
- A curated list of Survey Papers on Deep Learning.☆12Sep 5, 2023Updated 2 years ago
- [NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking☆13May 3, 2024Updated last year
- [CVPR2025] Official implementation of the paper "Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practi…☆47Oct 29, 2025Updated 5 months ago
- Sample projects to showcase the Unity Meta XR Interaction SDK.☆45Feb 28, 2026Updated last month
- Official PyTorch implementation of our CVPR 2025 paper: "SwiftEdit: Lightning Fast Text-guided Image Editing via One-step Diffusion"☆45Jan 7, 2026Updated 3 months ago
- BigOBench assesses the capacity of Large Language Models (LLMs) to comprehend time-space computational complexity of input or generated c…☆40Apr 15, 2025Updated last year
- Retrieval_OOD_for_Multimodal_AI☆11Dec 4, 2024Updated last year
- Code of "Robustifying Token Attention for Vision Transformers"☆20Dec 31, 2023Updated 2 years ago
- ☆18Nov 19, 2024Updated last year
- ☆28Apr 8, 2025Updated last year
- Official codes of the 1st place for The NVIDIA AI City Challenge 2023 - Track 2☆19Jul 25, 2023Updated 2 years ago
- [EMNLP 2024 Main] Official implementation of the paper "To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimoda…☆17Dec 13, 2024Updated last year
- [ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers☆19Apr 16, 2024Updated 2 years ago
- ☆14Jan 5, 2022Updated 4 years ago
- ☆11May 1, 2023Updated 2 years ago
- Multimodal_AI_Video_Dialogue☆16Dec 3, 2024Updated last year
- Global-Local Attention for Emotion Recognition☆20Nov 13, 2020Updated 5 years ago
- Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers" [ICCV 2025]☆101Jul 28, 2025Updated 8 months ago
- ☆13Nov 7, 2021Updated 4 years ago
- My PhD manuscript LaTeX code and the slides for the defense☆11Feb 2, 2022Updated 4 years ago
- ☆28Mar 13, 2025Updated last year
- The official code for ICCV 2023 paper "Reconstructing Groups of People with Hypergraph Relational Reasoning"☆12Jul 4, 2025Updated 9 months ago
- ☆15May 7, 2024Updated last year
- ICML 2025 Oral: ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence☆45Aug 8, 2025Updated 8 months ago
- This repository contains code for deploying a Gradio application using the SAM2 model for video processing. The application allows users …☆47Sep 24, 2024Updated last year
- X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024☆11Nov 7, 2024Updated last year
- In this project, facial recognition algorithm is implemented with python using PCA and SVD dimensionality reduction tools.☆10Sep 2, 2019Updated 6 years ago
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation☆48Oct 3, 2024Updated last year
- Official Release of NeurIPS 2024 paper "Slot State Space Models"☆11Mar 22, 2025Updated last year
- Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding☆13Aug 2, 2023Updated 2 years ago
- GPT-style network for phonemization with durations of text☆68Mar 21, 2024Updated 2 years ago
- Base repo for paper 'StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval'☆14Apr 27, 2022Updated 3 years ago
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model☆16Apr 7, 2026Updated last week
- Implementation of Recurrent Hidden Semi-Markov Model http://www.cc.gatech.edu/~lsong/papers/DaiDaiZhaLietal17.pdf☆12Mar 31, 2019Updated 7 years ago
- Directed masked autoencoders☆14Mar 25, 2026Updated 3 weeks ago
- [ECCV'24] Official code for "BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation"☆43Nov 19, 2024Updated last year
- ☆35Apr 9, 2026Updated last week