amjadmajid / BabyTorch
BabyTorch is a minimalist deep-learning framework with an API similar to PyTorch's. This minimalist design encourages learners to explore and understand the underlying algorithms and mechanics of deep learning. It is designed such that, when learners are ready to switch to PyTorch, they only need to remove the word `baby`, as the sketch below illustrates.
☆26 · Updated last month
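For illustration, here is a minimal sketch of that naming convention. The `babytorch` import path is an assumption about BabyTorch's module layout; the active lines use only standard PyTorch calls.

```python
# Minimal autograd example. The commented-out import is the assumed
# BabyTorch spelling; dropping the `baby` prefix gives plain PyTorch.
# import babytorch as torch   # hypothetical BabyTorch import path
import torch                  # the PyTorch version after the switch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (x * x).sum()   # simple scalar objective: sum of squares
loss.backward()        # autograd computes d(loss)/dx
print(x.grad)          # tensor([2., 4., 6.])
```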
Alternatives and similar repositories for BabyTorch
Users interested in BabyTorch are comparing it to the libraries listed below.
- Cost aware hyperparameter tuning algorithm ☆163 · Updated last year
- ☆274 · Updated last year
- ☆136 · Updated last week
- Implementation of Diffusion Transformer (DiT) in JAX ☆279 · Updated last year
- Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs ☆435 · Updated this week
- ☆114 · Updated last month
- Solve puzzles. Learn CUDA. ☆64 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆149 · Updated 3 weeks ago
- A simple library for scaling up JAX programs ☆139 · Updated 8 months ago
- seqax = sequence modeling + JAX ☆165 · Updated this week
- Minimal but scalable implementation of large language models in JAX ☆35 · Updated last week
- Jax/Flax rewrite of Karpathy's nanoGPT ☆59 · Updated 2 years ago
- ☆203 · Updated 5 months ago
- 🧱 Modula software package ☆209 · Updated 3 months ago
- Accelerate and optimize performance with streamlined training and serving options with JAX. ☆292 · Updated this week
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆625 · Updated this week
- Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox. ☆24 · Updated 9 months ago
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆97 · Updated 6 months ago
- LoRA for arbitrary JAX models and functions ☆140 · Updated last year
- Pytorch implementation of Evolutionary Policy Optimization, from Wang et al. of the Robotics Institute at Carnegie Mellon University ☆97 · Updated 2 weeks ago
- Efficient optimizers ☆249 · Updated last week
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆188 · Updated last month
- Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence). ☆120 · Updated 9 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆266 · Updated last week
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆92 · Updated 4 months ago
- ☆162 · Updated last year
- JAX-Toolbox ☆324 · Updated this week
- ☆150 · Updated 11 months ago
- For optimization algorithm research and development. ☆522 · Updated last week
- Maximal Update Parametrization (μP) with Flax & Optax. ☆16 · Updated last year