Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)
☆346 · May 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for transformer-from-scratch-notes
Users that are interested in transformer-from-scratch-notes are comparing it to the libraries listed below.
- Attention is all you need implementation · ☆1,186 · Jun 8, 2024 · Updated last year
- BERT explained from scratch · ☆16 · Oct 26, 2023 · Updated 2 years ago
- LLaMA 2 implemented from scratch in PyTorch · ☆367 · Sep 25, 2023 · Updated 2 years ago
- Notes about LLaMA 2 model · ☆73 · Aug 30, 2023 · Updated 2 years ago
- Notes on the Mistral AI model · ☆20 · Dec 27, 2023 · Updated 2 years ago
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch · ☆126 · Jul 24, 2023 · Updated 2 years ago
- Generate large textual corpora for almost any language by crawling the web · ☆13 · Feb 17, 2024 · Updated 2 years ago
- Stable Diffusion implemented from scratch in PyTorch · ☆1,037 · Oct 22, 2024 · Updated last year
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) · ☆180 · Jan 7, 2024 · Updated 2 years ago
- PSSGen: Portable Test and Stimulus Standard DSL Generator · ☆14 · Dec 29, 2025 · Updated 2 months ago
- Notes on quantization in neural networks · ☆121 · Dec 14, 2023 · Updated 2 years ago
- HAM · ☆18 · Sep 19, 2021 · Updated 4 years ago
- The official baseline implementations for Chronocept · ☆10 · Dec 21, 2025 · Updated 3 months ago
- mathematica (miscellaneous) · ☆19 · Nov 1, 2025 · Updated 4 months ago
- Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw · ☆596 · Dec 6, 2024 · Updated last year
- ☆15 · Feb 23, 2026 · Updated last month
- ☆12 · Jan 16, 2022 · Updated 4 years ago
- 🤖 Implementation of Self Normalizing Networks (SNN) in PyTorch. · ☆13 · Jun 19, 2017 · Updated 8 years ago
- ☆13 · Oct 5, 2025 · Updated 5 months ago
- Implementation of the paper "Denoising Diffusion Probabilistic Models" in PyTorch · ☆67 · Jul 4, 2023 · Updated 2 years ago
- Code for our paper "Decomposing The Dark Matter of Sparse Autoencoders" · ☆23 · Feb 6, 2025 · Updated last year
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" · ☆24 · Mar 4, 2025 · Updated last year
- ☆418 · Apr 10, 2025 · Updated 11 months ago
- MLX Swift implementation of Andrej Karpathy's Let's build GPT video · ☆63 · Apr 14, 2024 · Updated last year
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… · ☆15 · Apr 21, 2023 · Updated 2 years ago
- Sophgo AI chips driver and runtime library. · ☆23 · Mar 11, 2026 · Updated 2 weeks ago
- Code Transformer neural network components piece by piece · ☆379 · May 1, 2023 · Updated 2 years ago
- Persian for LaTeX, using XeTeX · ☆10 · Feb 20, 2022 · Updated 4 years ago
- Fast, gpu-accelerated distance transforms · ☆13 · Mar 7, 2025 · Updated last year
- Building Llama 3 from scratch using PyTorch · ☆13 · Sep 1, 2024 · Updated last year
- Deep Learning Implementations · ☆17 · Dec 24, 2020 · Updated 5 years ago
- A desktop compatible version of the Defog app · ☆14 · Aug 20, 2024 · Updated last year
- Notes and commented code for RLHF (PPO) · ☆127 · Feb 27, 2024 · Updated 2 years ago
- Code of the all the data augmentation [ Based on our survey, that will soon be published ] · ☆10 · Jul 5, 2023 · Updated 2 years ago
- ☆32 · Dec 29, 2023 · Updated 2 years ago
- [ICLR 2024] Thin-shell Object Manipulations with Differentiable Physics Simulations · ☆53 · Jun 5, 2024 · Updated last year
- A simple neural network framework · ☆22 · Jan 1, 2023 · Updated 3 years ago
- ☆14 · Jan 10, 2021 · Updated 5 years ago
- The source code of the paper An Eigenanalysis of Angle-Based Deformation Energies · ☆18 · Sep 12, 2023 · Updated 2 years ago