amjadmajid / BabyTorch
BabyTorch is a minimalist deep-learning framework with an API similar to PyTorch's. The minimalist design encourages learners to explore and understand the underlying algorithms and mechanics of deep learning. It is designed so that when learners are ready to switch to PyTorch, they only need to remove the word `baby`.
⭐27 · Updated 10 months ago
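To illustrate the naming claim, here is a minimal sketch of what a BabyTorch snippet might look like. The specific module and function names (`babytorch.nn`, `Linear`, `randn`, `backward`) are assumptions based on the stated PyTorch-mirroring API, not taken from the repository itself; dropping `baby` from the imports would give the equivalent PyTorch code.

```python
# Hedged sketch: assumes babytorch mirrors torch's namespaces (babytorch.nn, etc.)
# as the project description claims; the exact names are not verified against the repo.
import babytorch
import babytorch.nn as nn      # in PyTorch: import torch.nn as nn

model = nn.Linear(4, 1)        # one linear layer, 4 inputs -> 1 output
x = babytorch.randn(8, 4)      # a random batch of 8 samples
loss = (model(x) ** 2).mean()  # toy scalar loss
loss.backward()                # reverse-mode autodiff, as in PyTorch
```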
Alternatives and similar repositories for BabyTorch:
Users who are interested in BabyTorch are comparing it to the libraries listed below.
- ⭐87 · Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ⭐169 · Updated last week
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ⭐81 · Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ⭐76 · Updated 3 weeks ago
- Implementation of Diffusion Transformer (DiT) in JAX ⭐270 · Updated 9 months ago
- 🧱 Modula software package ⭐187 · Updated this week
- ⭐150 · Updated 7 months ago
- Supporting PyTorch FSDP for optimizers ⭐80 · Updated 3 months ago
- ⭐87 · Updated 2 weeks ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ⭐103 · Updated 4 months ago
- ⭐27 · Updated 8 months ago
- ⭐152 · Updated last year
- Cost-aware hyperparameter tuning algorithm ⭐148 · Updated 9 months ago
- Solve puzzles. Learn CUDA. ⭐63 · Updated last year
- Custom Triton kernels for training Karpathy's nanoGPT. ⭐18 · Updated 5 months ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ⭐44 · Updated 10 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ⭐108 · Updated 3 months ago
- ⭐91 · Updated 2 months ago
- Accelerated minigrid environments with JAX ⭐132 · Updated 8 months ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ⭐120 · Updated 8 months ago
- A simple library for scaling up JAX programs ⭐134 · Updated 5 months ago
- seqax = sequence modeling + JAX ⭐151 · Updated 2 weeks ago
- Accelerated First Order Parallel Associative Scan ⭐180 · Updated 7 months ago
- ⭐215 · Updated 8 months ago
- Fast + parallel AlphaZero in JAX ⭐94 · Updated 3 months ago
- Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (M… ⭐25 · Updated 4 months ago
- Experiment of using Tangent to autodiff Triton ⭐78 · Updated last year
- Efficient optimizers ⭐185 · Updated this week
- A package for defining deep learning models using categorical algebraic expressions. ⭐60 · Updated 8 months ago
- ⭐76 · Updated 8 months ago