Jaykef / min-patchnizer
Minimal, clean code for video/image "patchnization" - a process commonly used in tokenizing visual data for use in a Transformer encoder.
☆11 Updated last year
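As a rough illustration of what "patchnization" means, here is a minimal sketch in plain Python. It is not taken from the min-patchnizer repository: the function name `patchify`, the nested-list image layout, and the non-overlapping square patches are all assumptions made for this example. It cuts an H x W x C image into a row-major sequence of flattened patches, which is the form a Transformer encoder would typically project into token embeddings.

```python
def patchify(image, patch_size):
    """Split an H x W x C image (nested lists) into flattened, non-overlapping patches.

    Hypothetical helper for illustration only. Returns a list of patches in
    row-major order; each patch is a flat list of
    patch_size * patch_size * C values.
    """
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")

    patches = []
    # Walk the top-left corner of every patch, row by row.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            # Flatten the pixels of this patch, including all channels.
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    patch.extend(image[row][col])
            patches.append(patch)
    return patches


# A 4 x 4 single-channel image whose pixel values are 0..15.
image = [[[r * 4 + c] for c in range(4)] for r in range(4)]
patches = patchify(image, patch_size=2)
print(len(patches), len(patches[0]))  # 4 4
print(patches[0])                     # [0, 1, 4, 5]
```

For video, the same idea would presumably be applied per frame, or extended with a temporal patch dimension; the source description does not say which.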
Alternatives and similar repositories for min-patchnizer
Users who are interested in min-patchnizer are comparing it to the libraries listed below.
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆24 Updated 2 weeks ago
- Floral Diffusion is a custom diffusion model trained by jags using a DD 5.6 version ☆26 Updated 2 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆19 Updated 2 years ago
- ☆20 Updated 4 months ago
- GET3D online data renderer ☆11 Updated 2 years ago
- LoRA fine-tuned Stable Diffusion Deployment ☆31 Updated 2 years ago
- implementation of https://arxiv.org/pdf/2312.09299 ☆21 Updated last year
- Load any clip model with a standardized interface ☆21 Updated last year
- Digital daydreaming with CLIP Interrogator and Diffusers ☆13 Updated last month
- A fast approach for translating a series of text prompts into a video. The 2022 NeurIPS Workshop on Machine Learning for Creativity and D… ☆32 Updated 2 years ago
- ☆1 Updated last year
- Colab notebook to finetune GLIDE. ☆13 Updated 3 years ago
- ☆11 Updated last year
- The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration" ☆56 Updated last year
- Official Code for MIMETIC^2 ☆12 Updated 7 months ago
- Rust bindings for CTranslate2 ☆14 Updated 2 years ago
- ☆20 Updated 3 years ago
- Official repository for the paper "End-to-End Visual Editing with a Generatively Pre-Trained Artist", which is accepted at ECCV 2022. Her… ☆29 Updated 2 years ago
- ☆12 Updated 2 months ago
- Latent Diffusion Language Models ☆68 Updated last year
- Implementation of a holodeck, written in PyTorch ☆18 Updated last year
- ☆23 Updated 7 months ago
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆31 Updated last year
- ☆32 Updated 2 years ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec" ☆12 Updated 5 months ago
- Describe the format of image/text datasets ☆11 Updated 3 years ago
- Guide diffusion on ImageBind embedding similarity ☆29 Updated 2 years ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆21 Updated 11 months ago
- Multimodal Open Source Framework for Conversational Agent Research and Development. ☆19 Updated 5 months ago
- Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP. ☆16 Updated 4 years ago