CVHub520 / efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
☆24Updated last year
Alternatives and similar repositories for efficientvit:
Users who are interested in efficientvit are comparing it to the libraries listed below.
- Python scripts performing Open Vocabulary Object Detection using the YOLO-World model in ONNX.☆49Updated 11 months ago
- Official PyTorch Implementation of Self-emerging Token Labeling☆32Updated 11 months ago
- Code of paper "A new baseline for edge detection: Make Encoder-Decoder great again"☆37Updated last month
- ☆13Updated 3 years ago
- Codebase for the Recognize Anything Model (RAM)☆75Updated last year
- [NeurIPS2022] This is the official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fi…☆83Updated last year
- SAM-CLIP module for use with Autodistill.☆14Updated last year
- ☆32Updated last year
- Timm model explorer☆37Updated 11 months ago
- ☆33Updated last year
- EfficientSAM + YOLO World base model for use with Autodistill.☆10Updated last year
- ONNX-compatible DocShadow: High-Resolution Document Shadow Removal. Supports TensorRT 🚀☆21Updated last year
- Official Training and Inference Code of Amodal Expander, Proposed in Tracking Any Object Amodally☆15Updated 8 months ago
- Auto Segmentation label generation with SAM (Segment Anything) + Grounding DINO☆19Updated last month
- ☆25Updated 4 months ago
- [ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance☆85Updated 8 months ago
- Zero-label image classification via OpenCLIP knowledge distillation☆121Updated last year
- This repository is for the first survey on SAM for videos.☆35Updated 2 weeks ago
- HunyuanDiT with TensorRT and libtorch☆17Updated 10 months ago
- Code release for the CVPR'23 paper titled "PartDistillation: Learning Parts from Instance Segmentation"☆58Updated last year
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆25Updated last year
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"☆101Updated 6 months ago
- Estimate dataset difficulty and detect label mistakes using reconstruction error ratios!☆23Updated 2 months ago
- (ICLR 2024, CVPR 2024) SparseFormer☆73Updated 4 months ago
- A repository of semantic segmentation models that can be used to train on your own image datasets for segmentation tasks.☆12Updated last month
- Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts (CVPR 2024)☆74Updated 5 months ago