☆22 · Apr 24, 2025 · Updated last year
Alternatives and similar repositories for icons
Users that are interested in icons are comparing it to the libraries listed below.
- Dataset Quantization with Active Learning based Adaptive Sampling [ECCV 2024] ☆10 · Jul 9, 2024 · Updated last year
- Official Implementation of paper "Distilling Long-tailed Datasets" [CVPR 2025] ☆21 · Aug 13, 2025 · Updated 8 months ago
- Less is More: High-value Data Selection for Visual Instruction Tuning ☆18 · Jan 18, 2025 · Updated last year
- COLA: Evaluate how well your vision-language model can Compose Objects Localized with Attributes! ☆25 · Nov 23, 2024 · Updated last year
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision ☆28 · May 26, 2025 · Updated 11 months ago
- ☆11 · Jul 30, 2025 · Updated 9 months ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning" ☆17 · Jun 9, 2025 · Updated 10 months ago
- ☆28 · Oct 18, 2022 · Updated 3 years ago
- ☆51 · Oct 29, 2023 · Updated 2 years ago
- ☆13 · Jul 2, 2025 · Updated 10 months ago
- Compress conventional Vision-Language Pre-training data ☆52 · Sep 22, 2023 · Updated 2 years ago
- ☆20 · Apr 23, 2024 · Updated 2 years ago
- NegCLIP. ☆40 · Feb 6, 2023 · Updated 3 years ago
- Fast, free, easy, and object-agnostic video anonymization ☆12 · Dec 12, 2020 · Updated 5 years ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners" ☆29 · Apr 27, 2024 · Updated 2 years ago
- CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization ☆13 · Aug 3, 2024 · Updated last year
- Code for ORAR Agent for Vision and Language Navigation on Touchdown and map2seq ☆20 · Nov 3, 2023 · Updated 2 years ago
- Fine tune LLaVA 1.5 - based on article by wandb ☆13 · Feb 19, 2024 · Updated 2 years ago
- Code for T-MARS data filtering ☆35 · Aug 23, 2023 · Updated 2 years ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆38 · Aug 18, 2024 · Updated last year
- [CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding ☆56 · Apr 7, 2025 · Updated last year
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆141 · Dec 16, 2025 · Updated 4 months ago
- ☆10 · Mar 18, 2024 · Updated 2 years ago
- Rui Qian, Xin Yin, Chuanhang Deng, et al.: UGround: Towards Unified Visual Grounding with Unrolled Transformers (ICML 2026) ☆22 · Updated this week
- ☆12 · Mar 22, 2025 · Updated last year
- [ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers ☆19 · Apr 16, 2024 · Updated 2 years ago
- Code, Data and Red Teaming for ZeroBench ☆59 · Dec 23, 2025 · Updated 4 months ago
- Repository for "Who Plays First? Optimizing the Order of Play in Stackelberg Games with Many Robots" - RSS 2024 ☆18 · Jun 25, 2024 · Updated last year
- ☆39 · Jan 12, 2026 · Updated 3 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆147 · Apr 15, 2026 · Updated 2 weeks ago
- VGDFR: Diffusion-based Video Generation with Dynamic Frame Rate ☆17 · May 16, 2025 · Updated 11 months ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆32 · Jul 16, 2025 · Updated 9 months ago
- Generating Image Specific Text ☆29 · Aug 14, 2023 · Updated 2 years ago
- PyTorch implementation of PTQ4DiT https://arxiv.org/abs/2405.16005 ☆46 · Nov 8, 2024 · Updated last year
- ☆114 · Mar 14, 2024 · Updated 2 years ago
- ☆15 · May 28, 2024 · Updated last year
- ☆17 · Jul 12, 2025 · Updated 9 months ago
- ☆64 · Dec 30, 2024 · Updated last year
- [CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally? ☆35 · Apr 27, 2023 · Updated 3 years ago