robvanvolt / DALLE-tools
DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.
☆15 Updated 2 years ago
Related projects
Alternatives and complementary repositories for DALLE-tools
- Describe the format of image/text datasets ☆11 Updated 2 years ago
- Implementation of a holodeck, written in PyTorch ☆17 Updated last year
- ☆16 Updated 2 years ago
- Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP ☆39 Updated last year
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆30 Updated last year
- Colab notebook to finetune GLIDE. ☆13 Updated 2 years ago
- Generate images from texts. In Russian ☆19 Updated 2 years ago
- Script and models for clustering LAION-400m CLIP embeddings. ☆25 Updated 2 years ago
- CHARacter-awaRE Diffusion: Multilingual Character-Aware Encoders for Font-Aware Diffusers That Can Actually Spell ☆14 Updated last year
- Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP. ☆16 Updated 3 years ago
- ☆17 Updated 9 months ago
- Floral Diffusion is a custom diffusion model trained by jags using a DD 5.6 version ☆26 Updated 2 years ago
- Load any clip model with a standardized interface ☆21 Updated 6 months ago
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k. ☆22 Updated last year
- A CLIP conditioned Decision Transformer. ☆22 Updated 3 years ago
- Utilities for PyTorch distributed ☆23 Updated last year
- Finetune the 1.4B latent diffusion text2img-large checkpoint from CompVis using deepspeed. (work-in-progress) ☆36 Updated 2 years ago
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data. ☆17 Updated 2 weeks ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆23 Updated last week
- Visionner turns raw image data into NumPy arrays, more suitable for deep learning tasks ☆10 Updated last year
- ☆21 Updated 3 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f… ☆15 Updated 3 years ago
- Training simple models to predict CLIP image embeddings from text embeddings, and vice versa. ☆59 Updated 2 years ago
- A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ☆13 Updated 2 years ago
- GPT-jax based on the official huggingface library ☆13 Updated 3 years ago