Photoroom / datago
A natively parallel dataloader for Python, written in Rust. Serving data at GB/s speeds, while covering aspect ratio bucketing, crop and resize for image ML workloads.
☆123 · Updated this week
Alternatives and similar repositories for datago
Users who are interested in datago are comparing it to the libraries listed below.
- Framework based on a vector database to store, manage and curate large image datasets ☆81 · Updated 4 months ago
- ☆92 · Updated last year
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆160 · Updated last year
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆97 · Updated 5 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 · Updated last year
- supporting pytorch FSDP for optimizers ☆84 · Updated last year
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆34 · Updated 10 months ago
- Scalable and Performant Data Loading ☆356 · Updated this week
- WIP ☆93 · Updated last year