mit-han-lab / efficientvit
EfficientViT is a new family of vision models for efficient high-resolution vision.
☆1,760 Updated last month
Related projects:
- Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024 ☆1,332 Updated 2 months ago
- [ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity" ☆2,255 Updated 2 months ago
- [ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention ☆763 Updated 3 months ago
- [CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segme… ☆1,153 Updated 8 months ago
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything ☆2,081 Updated 3 months ago
- EVA Series: Visual Representation Fantasies from BAAI ☆2,209 Updated last month
- RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything ☆738 Updated 3 months ago
- Grounded Language-Image Pre-training ☆2,154 Updated 7 months ago
- [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model ☆2,812 Updated last month
- detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks. ☆1,969 Updated last month
- OpenMMLab Foundational Library for Training Deep Learning Models ☆1,139 Updated last week
- [CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions ☆2,474 Updated last month
- SAM with text prompt ☆1,542 Updated last month
- Code release for ConvNeXt V2 model ☆1,467 Updated last month
- OneFormer: One Transformer to Rule Universal Image Segmentation, arxiv 2022 / CVPR 2023 ☆1,440 Updated 10 months ago
- [CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥… ☆2,230 Updated 3 weeks ago
- Hiera: A fast, powerful, and simple hierarchical vision transformer. ☆857 Updated 6 months ago
- Code release for "Masked-attention Mask Transformer for Universal Image Segmentation" ☆2,466 Updated last month
- API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆707 Updated last month
- [ICCV 2023] Tracking Anything with Decoupled Video Segmentation ☆1,217 Updated last month
- [ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions ☆1,215 Updated 6 months ago
- Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, detection ... in specific scena… ☆754 Updated last year
- EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022] ☆975 Updated last year
- Official PyTorch implementation of SegFormer ☆2,480 Updated last month
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection" ☆6,262 Updated last month
- VMamba: Visual State Space Models, code is based on mamba ☆2,028 Updated last month
- Painter & SegGPT Series: Vision Foundation Models from BAAI ☆2,497 Updated 10 months ago
- This is a collection of our NAS and Vision Transformer work. ☆1,655 Updated last month
- Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything ☆947 Updated this week
- [ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection" ☆2,163 Updated last month