msaroufim / mynotes
☆18 Updated 3 weeks ago
Alternatives and similar repositories for mynotes
Users that are interested in mynotes are comparing it to the libraries listed below
- ML/DL Math and Method notes ☆66 Updated 2 years ago
- Implementation of Flash Attention in Jax ☆225 Updated last year
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 Updated last year
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors` ☆47 Updated last year
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆86 Updated 2 years ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆198 Updated 8 months ago
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 Updated last year
- This is a port of Mistral-7B model in JAX ☆33 Updated last year
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆190 Updated 3 years ago
- ☆17 Updated 2 years ago
- ☆68 Updated 10 months ago
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, written in pure C. ☆23 Updated last year
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆68 Updated last week
- Experiment of using Tangent to autodiff triton ☆82 Updated 2 years ago
- A Jax-based library for building transformers, includes implementations of GPT, Gemma, LlaMa, Mixtral, Whisper, SWin, ViT and more. ☆298 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Updated 3 years ago
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆155 Updated 2 years ago
- Highly commented implementations of Transformers in PyTorch ☆138 Updated 2 years ago
- An implementation of the Llama architecture, to instruct and delight ☆21 Updated 8 months ago
- ☆92 Updated last year
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 Updated 2 years ago
- Functional local implementations of main model parallelism approaches ☆95 Updated 2 years ago
- Pre-train BERT from scratch, with HuggingFace. Accompanies the blog post: sidsite.com/posts/bert-from-scratch ☆43 Updated 8 months ago
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 Updated 2 years ago
- JAX implementation of the Mistral 7b v0.2 model ☆35 Updated last year
- Notebooks for the "Deep Learning with JAX" book ☆168 Updated 8 months ago
- ☆129 Updated last year
- ☆22 Updated last year
- Resources from the EleutherAI Math Reading Group ☆54 Updated 11 months ago
- MinT: Minimal Transformer Library and Tutorials ☆260 Updated 3 years ago