A visual LLM for image region description or QA.
☆16 Jul 14, 2023 Updated 2 years ago
Alternatives and similar repositories for Click4Caption
Users who are interested in Click4Caption are comparing it to the repositories listed below.
- [ECCV 2022] Official pytorch implementation of "mc-BEiT: Multi-choice Discretization for Image BERT Pre-training" in European Conference … ☆22 Sep 13, 2022 Updated 3 years ago
- A self-supervised learning approach based on extremely large masking ☆31 Dec 19, 2022 Updated 3 years ago
- Official codes for ConMIM (ICLR 2023) ☆58 Feb 8, 2023 Updated 3 years ago
- ☆12 May 26, 2022 Updated 3 years ago
- a recommendation list of math courses for people with no math background. ☆11 Mar 2, 2021 Updated 4 years ago
- ☆92 Nov 25, 2023 Updated 2 years ago
- This is an official implementation of our CVPR 2020 paper "Non-Local Neural Networks With Grouped Bilinear Attentional Transforms". ☆12 Jan 30, 2021 Updated 5 years ago
- ☆18 Aug 20, 2024 Updated last year
- evemu - Kernel device emulation ☆10 Oct 2, 2017 Updated 8 years ago
- A2C, ACKTR and A2T implementations for ViZDoom ☆10 Dec 18, 2017 Updated 8 years ago
- ☆15 Apr 11, 2023 Updated 2 years ago
- Attributes Recognition of Apparel ☆10 Jan 8, 2019 Updated 7 years ago
- Some commonly used functions and modules ☆10 Jan 15, 2024 Updated 2 years ago
- codes for ICML2021 paper iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients ☆10 May 27, 2021 Updated 4 years ago
- The Structure and Interpretation of Deep Networks Handbook ☆14 Dec 14, 2024 Updated last year
- Seamlessly integrate IoT data with AI agents, enabling the effortless parsing, processing, and utilization of IoT data streams. ☆10 Jan 27, 2025 Updated last year
- Q&A dataset for many-shot jailbreaking ☆14 Jul 19, 2024 Updated last year
- Hand Written Blots augmentation ☆12 Aug 28, 2025 Updated 6 months ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation ☆30 Dec 22, 2025 Updated 2 months ago
- ☆13 Apr 3, 2024 Updated last year
- This repository contains the code for our papers "Learning Condition Invariant Features for Retrieval-Based Localization from 1M Images" … ☆11 Oct 25, 2020 Updated 5 years ago
- A PyTorch implementation of Proxy Anchor Loss based on CVPR 2020 paper "Proxy Anchor Loss for Deep Metric Learning" ☆11 Jan 16, 2021 Updated 5 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions" ☆43 Jul 15, 2022 Updated 3 years ago
- Weighted-Boxes-Fusion method implementation with YOLOv4 and YOLOv5 ☆11 Jul 14, 2022 Updated 3 years ago
- Pytorch ImageNet1k Loader with Bounding Boxes. ☆13 Jan 23, 2022 Updated 4 years ago
- A sample client code for capturing panorama images by a modified AirSim ☆14 Aug 20, 2022 Updated 3 years ago
- ☆13 Jun 8, 2019 Updated 6 years ago
- Collection of papers about video-audio understanding ☆22 Dec 26, 2025 Updated 2 months ago
- Reinforcement Learning from Hierarchical Critics ☆13 Jul 30, 2020 Updated 5 years ago
- [ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning" ☆10 Jul 24, 2022 Updated 3 years ago
- Demo scripts for HPS Dataset (http://virtualhumans.mpi-inf.mpg.de/hps/) ☆11 Mar 10, 2025 Updated 11 months ago
- Turning to Video for Transcript Sorting ☆49 Aug 27, 2023 Updated 2 years ago
- This repository has used AI2THOR CVPR data set. ☆13 Dec 7, 2018 Updated 7 years ago
- Pytorch implementation of our paper (TNNLS) -- Pruning Networks with Cross-Layer Ranking & k-Reciprocal Nearest Filters ☆12 Feb 24, 2022 Updated 4 years ago
- Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and … ☆13 Aug 29, 2024 Updated last year
- Concept-based generative models ☆13 Dec 13, 2024 Updated last year
- Code for "Unsupervised Visuomotor Control through Distributional Planning Networks" ☆10 Jun 27, 2019 Updated 6 years ago
- SIGIR paper Conversational Fashion Image Retrieval via Multiturn Natural Language Feedback ☆14 Oct 17, 2022 Updated 3 years ago
- recognize chinese and english without segmentation ☆11 Aug 22, 2018 Updated 7 years ago