rom1504 / embedding-reader
Efficiently read embeddings in streaming from any filesystem
☆104 Updated 4 months ago
Alternatives and similar repositories for embedding-reader
Users that are interested in embedding-reader are comparing it to the libraries listed below
- Get hundreds of millions of image+url pairs from the crawling at home dataset and preprocess them ☆223 Updated last year
- Simple python template ☆42 Updated last year
- Using pretrained encoder and language models to generate captions from multimedia inputs. ☆98 Updated 2 years ago
- CLOOB training (JAX) and inference (JAX and PyTorch) ☆74 Updated 3 years ago
- Simple large-scale training of stable diffusion with multi-node support. ☆133 Updated 2 years ago
- ☆65 Updated 2 years ago
- ☆103 Updated last year
- Easily compute CLIP embeddings from video frames ☆147 Updated 2 years ago
- ☆112 Updated 4 years ago
- JAX implementation of ViT-VQGAN ☆82 Updated 3 years ago
- Aim for the moon. If you miss, you may hit a star. ☆164 Updated 2 years ago
- A repository containing datasets and tools to train a watermark classifier. ☆74 Updated 3 years ago
- Easily convert Common Crawl to a dataset of captions and documents. Image/text, audio/text, video/text, ... ☆322 Updated 2 years ago
- ☆160 Updated 3 years ago
- Jupyter Notebooks for experimenting with negative prompting with Stable Diffusion 2.0. ☆87 Updated 3 years ago
- Load any CLIP model with a standardized interface ☆22 Updated 2 months ago
- Finetune glide-text2im from OpenAI on your own data. ☆88 Updated 2 months ago
- Aggregating embeddings over time ☆32 Updated 2 years ago
- Diffusion-based markup-to-image generation ☆83 Updated 2 years ago
- Implementation of the video diffusion model and training scheme presented in the paper, Flexible Diffusion Modeling of Long Videos, in Py… ☆85 Updated 3 years ago
- Finetune the 1.4B latent diffusion text2img-large checkpoint from CompVis using deepspeed. (work-in-progress) ☆36 Updated 3 years ago
- Training simple models to predict CLIP image embeddings from text embeddings, and vice versa. ☆60 Updated 3 years ago
- Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training ☆168 Updated 2 years ago
- Let's make a video clip ☆96 Updated 3 years ago
- Official implementation of "Active Image Indexing" ☆60 Updated 2 years ago
- Codebase for the SIMAT dataset and evaluation ☆38 Updated 3 years ago
- Description and pointers of LAION datasets ☆248 Updated 3 years ago
- An open source implementation of CLIP. ☆33 Updated 3 years ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in PyTorch ☆103 Updated 2 years ago
- Script and models for clustering LAION-400m CLIP embeddings. ☆26 Updated 3 years ago