This is a public repository for Image Clustering Conditioned on Text Criteria (IC|TC)
☆92Mar 19, 2024Updated last year
Alternatives and similar repositories for ICTC
Users that are interested in ICTC are comparing it to the libraries listed below
Sorting:
- Project for SNARE benchmark☆11Jun 5, 2024Updated last year
- Official Pytorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des…☆55Aug 27, 2025Updated 6 months ago
- Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback☆26Aug 27, 2023Updated 2 years ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges☆30Sep 24, 2023Updated 2 years ago
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Jun 27, 2024Updated last year
- [ECAI 2023] MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient☆32Dec 8, 2023Updated 2 years ago
- HIRL: A General Framework for Hierarchical Image Representation Learning (http://arxiv.org/abs/2205.13159)☆40Jul 7, 2022Updated 3 years ago
- ☆35Jan 23, 2024Updated 2 years ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- Code for the paper "Image Clustering with External Guidance" (ICML 2024)☆17Oct 26, 2024Updated last year
- 2D road segmentation using lidar data during training☆43Dec 21, 2023Updated 2 years ago
- Code and data for CoachLM, an automatic instruction revision approach LLM instruction tuning.☆60Mar 20, 2024Updated last year
- ☆21Nov 9, 2025Updated 3 months ago
- 🪞A powerful toolkit for almost all the Information Extraction tasks.☆124Apr 21, 2025Updated 10 months ago
- ☆43Sep 19, 2024Updated last year
- NeurIPS2023 Spotlight: Masked Space-time Hashing☆92Oct 27, 2023Updated 2 years ago
- ☆25Sep 19, 2023Updated 2 years ago
- SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models☆21Jan 11, 2024Updated 2 years ago
- Rare-to-Frequent (R2F), ICLR'25, Spotlight☆53Apr 23, 2025Updated 10 months ago
- [IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation☆128Oct 8, 2024Updated last year
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning"☆32Mar 26, 2025Updated 11 months ago
- ☆11Aug 7, 2025Updated 6 months ago
- ☆10Jul 5, 2024Updated last year
- Modern, type-safe, zero-dependency Python library for serial port I/O access☆23Dec 16, 2025Updated 2 months ago
- ☆13Aug 14, 2022Updated 3 years ago
- (NeurIPS 2024) What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights☆28Oct 28, 2024Updated last year
- Simple program to manually caption your images (or any other file types) so you can use them for AI training☆37Mar 20, 2023Updated 2 years ago
- Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?☆15Jun 3, 2025Updated 9 months ago
- [ICCV 2023] Going Beyond Nouns With Vision & Language Models Using Synthetic Data☆14Sep 30, 2023Updated 2 years ago
- ☆11Sep 15, 2023Updated 2 years ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago
- ☆13May 12, 2025Updated 9 months ago
- Benchmarking Multi-Image Understanding in Vision and Language Models☆12Jul 29, 2024Updated last year
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"☆37Aug 18, 2024Updated last year
- Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots☆50Jul 29, 2025Updated 7 months ago
- Pytorch code for paper From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models☆206Jan 8, 2025Updated last year
- Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!☆11May 24, 2023Updated 2 years ago
- ☆11Sep 28, 2023Updated 2 years ago
- Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models"☆15Oct 12, 2023Updated 2 years ago