robvanvolt / DALLE-tools
DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.
☆15Updated 2 years ago
Alternatives and similar repositories for DALLE-tools:
Users who are interested in DALLE-tools are comparing it to the libraries listed below:
- Describe the format of image/text datasets☆11Updated 2 years ago
- Implementation of a holodeck, written in PyTorch☆17Updated last year
- Load any CLIP model with a standardized interface☆21Updated 9 months ago
- Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP.☆15Updated 3 years ago
- Script and models for clustering LAION-400m CLIP embeddings.☆25Updated 3 years ago
- Generate images from texts. In Russian☆19Updated 3 years ago
- Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP☆39Updated 2 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆15Updated 3 years ago
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator☆31Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆21Updated 6 months ago
- A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis.☆13Updated 2 years ago
- Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k.☆22Updated 2 years ago
- Implementation of Metaformer, but in an autoregressive manner☆23Updated 2 years ago
- Colab notebook to finetune GLIDE.☆13Updated 2 years ago
- Visionner turns raw image data into NumPy arrays, more suitable for deep learning tasks☆10Updated last year
- Visual search interface☆11Updated 3 years ago
- Floral Diffusion is a custom diffusion model trained by jags using a DD 5.6 version☆26Updated 2 years ago
- Shows how to do parameter ensembling using differential evolution.☆10Updated 3 years ago
- Finetune the 1.4B latent diffusion text2img-large checkpoint from CompVis using deepspeed. (work-in-progress)☆36Updated 2 years ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆23Updated this week
- This repository hosts the code to port NumPy model weights of BiT-ResNets to TensorFlow SavedModel format.☆14Updated 3 years ago
- Utilities for PyTorch distributed☆23Updated last year
- Training simple models to predict CLIP image embeddings from text embeddings, and vice versa.☆60Updated 2 years ago
- The original weights of some Caffe models, ported to PyTorch.☆11Updated 3 years ago