sayakpaul / simple-image-recaptioning
Recaption large (Web)Datasets with vllm and save the artifacts.
☆30 Updated last month
Related projects
Alternatives and complementary repositories for simple-image-recaptioning
- ☆27 Updated 2 weeks ago
- ☆26 Updated 6 months ago
- ☆24 Updated 5 months ago
- ☆27 Updated 3 months ago
- faster parallel inference of mochi-1 video generation model ☆73 Updated this week
- ☆21 Updated 5 months ago
- Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO). ☆61 Updated 5 months ago
- Implementation of the proposed MaskBit from Bytedance AI ☆62 Updated last week
- Writing FLUX in Triton ☆30 Updated last month
- ☆32 Updated 3 weeks ago
- WIP PyTorch code for stably training single-step, mode-dropping, deterministic autoencoders ☆22 Updated 6 months ago
- ☆71 Updated last year
- Official Implementation of weights2weights ☆121 Updated last month
- ☆33 Updated 6 months ago
- ☆78 Updated 3 months ago
- A Gradio component that can be used to annotate images with bounding boxes. ☆31 Updated 3 weeks ago
- ☆40 Updated this week
- Official repository for the paper "VQDM: Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization" ☆29 Updated 2 months ago
- Implementation of "SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing" ☆84 Updated 10 months ago
- Video-LLaVA fine-tune for CinePile evaluation ☆38 Updated 3 months ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆30 Updated 4 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video… ☆29 Updated 4 months ago
- A one-stop library to standardize the inference and evaluation of all the conditional video generation models. ☆43 Updated 2 weeks ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆18 Updated 3 months ago
- DPO, but faster 🚀 ☆23 Updated 3 weeks ago
- ☆55 Updated 3 weeks ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆57 Updated last month
- Let's try and finetune the OpenAI consistency decoder to work for SDXL ☆23 Updated 11 months ago