apple / ml-fastvit
This repository contains the official implementation of the research paper "FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization" (ICCV 2023).
☆1,864 Updated last year
Alternatives and similar repositories for ml-fastvit:
Users who are interested in ml-fastvit also compare it to the repositories listed below:
- Hiera: A fast, powerful, and simple hierarchical vision transformer. ☆951 Updated 11 months ago
- This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinf… ☆828 Updated 2 months ago
- CVNets: A library for training computer vision networks ☆1,824 Updated last year
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. ☆2,606 Updated this week
- [ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention ☆822 Updated 8 months ago
- Segment Anything in High Quality [NeurIPS 2023] ☆3,806 Updated 2 months ago
- [NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once" ☆4,494 Updated 6 months ago
- PyTorch code and models for the DINOv2 self-supervised learning method. ☆9,827 Updated 6 months ago
- [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model ☆3,220 Updated last week
- EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything ☆2,264 Updated last month
- Painter & SegGPT Series: Vision Foundation Models from BAAI ☆2,552 Updated 2 months ago
- Scenic: A Jax Library for Computer Vision Research and Beyond ☆3,434 Updated 3 weeks ago
- ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,352 Updated 2 months ago
- [CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language ☆1,303 Updated last year
- Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised… ☆2,914 Updated 9 months ago
- Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B). ☆2,189 Updated last year
- Efficient vision foundation models for high-resolution generation and perception. ☆2,646 Updated 3 weeks ago
- [ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity" ☆2,482 Updated 7 months ago
- A method to increase the speed and lower the memory footprint of existing vision transformers. ☆1,012 Updated 8 months ago
- Code release for ConvNeXt V2 model ☆1,614 Updated 6 months ago
- EVA Series: Visual Representation Fantasies from BAAI ☆2,420 Updated 6 months ago
- PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation ☆5,031 Updated 6 months ago
- Foundation Architecture for (M)LLMs ☆3,048 Updated 10 months ago
- LAVIS - A One-stop Library for Language-Vision Intelligence ☆10,261 Updated 3 months ago
- Fast Segment Anything ☆7,704 Updated 6 months ago
- Meta-Transformer for Unified Multimodal Learning ☆1,566 Updated last year
- Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024 ☆1,456 Updated 7 months ago
- [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection" ☆7,427 Updated 6 months ago
- [CVPR 2023] OneFormer: One Transformer to Rule Universal Image Segmentation ☆1,559 Updated 4 months ago
- This repository contains the official implementation of the research paper, "An Improved One millisecond Mobile Backbone". ☆750 Updated 2 years ago