LAION-AI/laion50BU

Un-*** 50 billions multimodality dataset

☆24

Alternatives and similar repositories for laion50BU

Users that are interested in laion50BU are comparing it to the libraries listed below.

LAION-AI / scaling-laws-for-comparison
☆22 · May 12, 2026 · Updated 2 months ago
TheoCoombes / crawlingathome
A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
☆33 · Mar 21, 2023 · Updated 3 years ago
pbaylies / clustering-laion400m
Script and models for clustering LAION-400m CLIP embeddings.
☆26 · Jan 10, 2022 · Updated 4 years ago
LAION-AI / General-GPT
☆65 · Oct 4, 2023 · Updated 2 years ago
LAION-AI / LAION-PEOPLE
This project provides a data set with bounding boxes, body poses, 3D face meshes & captions of people from our LAION-2.2B. Additionally i…
☆14 · Jan 2, 2022 · Updated 4 years ago
AvrahamRaviv / Text2All
A curated list of text-guided generative models resources
☆159 · Nov 2, 2022 · Updated 3 years ago
pbaylies / Augmented_CLIP
Training simple models to predict CLIP image embeddings from text embeddings, and vice versa.
☆60 · Mar 31, 2022 · Updated 4 years ago
OpenEuroLLM / oellm-eval
A CLI to run reproducible evaluation across EuroHPC clusters
☆18 · Jul 9, 2026 · Updated 2 weeks ago
encounter1997 / Arxiv-Daily
My daily arxiv reading note
☆30 · Nov 10, 2021 · Updated 4 years ago
SLAMPAI / large-scale-pretraining-transfer
Code for reproducing the experiments on large-scale pre-training and transfer learning for the paper "Effect of large-scale pre-training …
☆19 · May 29, 2022 · Updated 4 years ago
facebookresearch / dmae_st
Directed masked autoencoders
☆14 · Mar 25, 2026 · Updated 3 months ago
afiaka87 / dalle-pytorch-datasets
☆12 · Jun 14, 2021 · Updated 5 years ago
kingoflolz / cc_img_dl
☆27 · Mar 13, 2021 · Updated 5 years ago
iejMac / video2numpy
Optimized library for large-scale extraction of frames and audio from video.
☆203 · Sep 11, 2023 · Updated 2 years ago
MineDojo / egl-docker
Xfce Desktop container designed for direct access to the GPU with EGL using VirtualGL for GPUs. Does not require /tmp/.X11-unix host sock…
☆10 · Jul 25, 2022 · Updated 3 years ago
rom1504 / cc2dataset
Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ...
☆321 · Dec 9, 2023 · Updated 2 years ago
christophschuhmann / 4MC-4M-Image-Text-Pairs-with-CLIP-embeddings
I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…
☆17 · Apr 22, 2021 · Updated 5 years ago
lucidrains / CLAP
Contrastive Language-Audio Pretraining
☆15 · May 18, 2021 · Updated 5 years ago
iejMac / clip-video-encode
Easily compute clip embeddings from video frames
☆149 · Oct 31, 2023 · Updated 2 years ago
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33 · Jul 21, 2023 · Updated 3 years ago
rvencu / crawlingathome-gpu-hcloud
GPU controlled Hetzner Cloud workers swarm for Crawling@Home project
☆58 · Oct 9, 2022 · Updated 3 years ago
rom1504 / python-template
Simple python template
☆44 · Apr 25, 2024 · Updated 2 years ago
w3c / mediacapture-handle
☆15 · Mar 6, 2025 · Updated last year
mlfoundations / datacomp
DataComp: In search of the next generation of multimodal datasets
☆787 · Apr 28, 2025 · Updated last year
salesforce / MUST
PyTorch code for MUST
☆108 · May 1, 2025 · Updated last year
technobird22 / NeoGen
A tool for generating awesome AI art
☆17 · Jul 29, 2022 · Updated 3 years ago
cat-state / clip_benchmark
clip retrieval benchmark
☆17 · May 4, 2022 · Updated 4 years ago
Caoang327 / vis_det
Code of "Visualizing and Understanding Object Detector"
☆20 · Jun 24, 2021 · Updated 5 years ago
SenZHANG-GitHub / InfoOdometry
[IJCV 2022] Information-Theoretic Odometry Learning
☆16 · Apr 19, 2023 · Updated 3 years ago
patil-suraj / vit-vqgan
JAX implementation ViT-VQGAN
☆82 · Sep 21, 2022 · Updated 3 years ago
afiaka87 / laionide
checkpoints for glide finetuned on laion and other datasets. wip.
☆50 · Aug 17, 2022 · Updated 3 years ago
pcuenca / lpips-j
Minimal JAX/Flax port of `lpips` supporting `vgg16`, with pre-trained weights stored in the 🤗 Hugging Face hub.
☆17 · Aug 1, 2022 · Updated 3 years ago
naver-ai / imagenet-annotation-tool
☆17 · Jul 24, 2023 · Updated 2 years ago
crowsonkb / cloob-training
CLOOB training (JAX) and inference (JAX and PyTorch)
☆76 · May 16, 2022 · Updated 4 years ago
mlfoundations / Gelato
🍨 Gelato — From Data Curation to Reinforcement Learning: Building a Strong Grounding Model for Computer-Use Agents
☆46 · Dec 22, 2025 · Updated 7 months ago
kakaobrain / coyo-dataset
COYO-700M: Large-scale Image-Text Pair Dataset
☆1,256 · Nov 30, 2022 · Updated 3 years ago
lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88 · Jul 9, 2023 · Updated 3 years ago
Jack000 / glid-3
combination of OpenAI GLIDE and Latent Diffusion
☆136 · Apr 7, 2022 · Updated 4 years ago
serre-lab / CVR
A Benchmark for Efficient and Compositional Visual Reasoning
☆25 · Aug 2, 2023 · Updated 2 years ago