yk / litter
★71 Updated 2 years ago
Alternatives and similar repositories for litter
Users that are interested in litter are comparing it to the libraries listed below
- JAX Implementation of Black Forest Labs' Flux.1 family of models ★40 Updated 2 months ago
- Train vision models using JAX and 🤗 transformers ★100 Updated last month
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ★103 Updated last year
- ★53 Updated 2 years ago
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… ★121 Updated 2 years ago
- σ-GPT: A New Approach to Autoregressive Models ★70 Updated last year
- Collection of autoregressive model implementation ★85 Updated 3 weeks ago
- ★82 Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ★170 Updated last year
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ★86 Updated 2 years ago
- Latent Diffusion Language Models ★70 Updated 2 years ago
- ★31 Updated 2 years ago
- Thispersondoesnotexist went down, so this time, while building it back up, I am going to open source all of it. ★91 Updated 2 years ago
- Smol but mighty language model ★65 Updated 2 years ago
- ★63 Updated last year
- ★111 Updated 6 months ago
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ★92 Updated 2 years ago
- Functional local implementations of main model parallelism approaches ★95 Updated 2 years ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ★132 Updated last year
- ★62 Updated 2 years ago
- Automatically take good care of your preemptible TPUs ★37 Updated 2 years ago
- The Next Generation Multi-Modality Superintelligence ★70 Updated last year
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ★183 Updated 3 months ago
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ★116 Updated 2 years ago
- Cerule - A Tiny Mighty Vision Model ★68 Updated 3 months ago
- ★50 Updated last year
- ★22 Updated 2 years ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ★47 Updated last year
- ★68 Updated last year
- Merge LLM that are split in to parts ★27 Updated 6 months ago