tmbdev-archive / webdataset-imagenet-2
A small repository demonstrating the use of WebDataset with ImageNet
☆15Updated last year
Alternatives and similar repositories for webdataset-imagenet-2:
Users who are interested in webdataset-imagenet-2 are comparing it to the repositories listed below:
- Release of ImageNet-Captions☆45Updated 2 years ago
- ☆23Updated 3 weeks ago
- ☆33Updated last year
- ☆23Updated last year
- ☆15Updated last month
- ☆40Updated last year
- Minimal multi-gpu implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models"☆27Updated 10 months ago
- This is an official PyTorch/GPU implementation of SupMAE.☆77Updated 2 years ago
- The official implementation of Diffusion-KTO: Aligning Diffusion Models by Optimizing Human Utility☆31Updated this week
- Official implementation of "Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization"☆76Updated 9 months ago
- https://arxiv.org/abs/2209.15162 ☆48Updated last year
- Command-line tool for downloading and extending the RedCaps dataset.☆46Updated last year
- A Video Tokenizer Evaluation Dataset☆88Updated last week
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆81Updated 5 months ago
- ElasticTok: Adaptive Tokenization for Image and Video☆46Updated 2 months ago
- An in-context conditioning version of MUSE with pre-trained checkpoints.☆111Updated last year
- JAX implementation of ViT-VQGAN☆80Updated 2 years ago
- Patching open-vocabulary models by interpolating weights☆91Updated last year
- ImageNet-12k subset of ImageNet-21k (fall11)☆21Updated last year
- PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.☆25Updated 3 months ago
- Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth?☆94Updated 2 months ago
- Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)☆158Updated last year
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"☆36Updated last year
- Implementation for <Orthogonal Over-Parameterized Training> in CVPR'21.☆19Updated 3 years ago
- ☆49Updated last year
- Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).☆35Updated 8 months ago
- Code for the paper titled "CiT: Curation in Training for Effective Vision-Language Data".☆78Updated 2 years ago
- Code for T-MARS data filtering☆35Updated last year
- Official Release of NeurIPS 2023 Spotlight paper "Object-Centric Slot Diffusion"☆61Updated 10 months ago