CVHub520 / efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
☆22 · Updated last year
Related projects
Alternatives and complementary repositories for efficientvit
- ☆33 · Updated 10 months ago
- This repo contains extensions to the DINOv2 model by Meta, and awesome applications built on top of it. ☆38 · Updated last year
- Official PyTorch Implementation of Self-emerging Token Labeling ☆30 · Updated 7 months ago
- MobileSAM already integrated into Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds ☆34 · Updated last year
- [NeurIPS2022] This is the official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fi… ☆82 · Updated last year
- Official Training and Inference Code of Amodal Expander, Proposed in Tracking Any Object Amodally ☆14 · Updated 4 months ago
- [ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance ☆66 · Updated 4 months ago
- Python scripts performing Open Vocabulary Object Detection using the YOLO-World model in ONNX. ☆41 · Updated 7 months ago
- ☆62 · Updated 11 months ago
- Auto Segmentation label generation with SAM (Segment Anything) + Grounding DINO ☆15 · Updated last year
- Code release for the CVPR'23 paper titled "PartDistillation: Learning Parts from Instance Segmentation" ☆59 · Updated 11 months ago
- Image/instance retrieval using CLIP, a self-supervised learning model ☆22 · Updated last year
- Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts (CVPR 2024) ☆65 · Updated last month
- Baby-DALL3: Annotate anything in visual tasks and generate anything, all in one pipeline with GPT-4 (a small baby of DALL·E 3). ☆82 · Updated last year
- This repository is for the first survey on SAM for videos. ☆19 · Updated this week
- Distilling the powerful segment anything models into lightweight ones for efficient segmentation. ☆29 · Updated last year
- Stable Diffusion in TensorRT 8.5+ ☆14 · Updated last year
- SAM-CLIP module for use with Autodistill. ☆12 · Updated last year
- An interactive demo based on Segment-Anything for stroke-based painting which enables human-like painting. ☆34 · Updated last year
- A practice for million-scale multi-domain universal object detection ☆22 · Updated 5 months ago
- Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation" ☆35 · Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆22 · Updated 10 months ago
- ☆23 · Updated last month
- Vision-oriented multimodal AI ☆49 · Updated 5 months ago
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch ☆26 · Updated last week
- CLIP and SigLIP models optimized with TensorRT with a Transformers-like API ☆15 · Updated last month
- ViT trained on COYO-Labeled-300M dataset ☆29 · Updated last year
- ☆52 · Updated last year
- ☆30 · Updated last year