A well-documented, unit-tested, type-checked, and formatted implementation of a vanilla transformer, for educational purposes.
☆287 · Mar 27, 2026 · Updated last month
Alternatives and similar repositories for transformer-from-scratch
Users that are interested in transformer-from-scratch are comparing it to the libraries listed below.
- Simple transformer implementation from scratch in pytorch. (archival, latest version on codeberg) ☆1,097 · Mar 20, 2025 · Updated last year
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆27 · Feb 16, 2026 · Updated 2 months ago
- The code for the video tutorial series on building a Transformer from scratch: https://www.youtube.com/watch?v=XR4VDnJzB8o ☆19 · Apr 15, 2023 · Updated 3 years ago
- Distributed Communication-Optimal Shuffle and Transpose Algorithm ☆14 · Apr 18, 2026 · Updated last week
- Transformer implementation from scratch (in PyTorch) ☆20 · Jun 17, 2023 · Updated 2 years ago
- Code and data for the paper "Disentangling Uncertainty in Machine Translation Evaluation", accepted at EMNLP 2022. ☆23 · Jun 23, 2023 · Updated 2 years ago
- How certain is your transformer? ☆25 · Apr 25, 2021 · Updated 5 years ago
- CSC Training: High-Level GPU Programming ☆14 · Oct 16, 2025 · Updated 6 months ago
- Code for Neural Estimation of the Rate-Distortion Function With Applications to Operational Source Coding ☆14 · Nov 2, 2023 · Updated 2 years ago
- Teaching Models to Express Their Uncertainty in Words ☆38 · May 26, 2022 · Updated 3 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 · Apr 2, 2025 · Updated last year
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆16 · Jan 7, 2025 · Updated last year
- ☆10 · Sep 13, 2021 · Updated 4 years ago
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆50 · Dec 6, 2024 · Updated last year
- ☆22 · May 6, 2020 · Updated 5 years ago
- Repository for DEMETR: Diagnosing Evaluation Metrics for Translation ☆17 · Nov 29, 2022 · Updated 3 years ago
- Laplace Redux -- Effortless Bayesian Deep Learning ☆45 · Jun 6, 2025 · Updated 10 months ago
- A dataset of alignment research and code to reproduce it ☆78 · Jun 22, 2023 · Updated 2 years ago
- Pytorch implementation of regularization methods for deep networks obtained via kernel methods. ☆22 · Dec 27, 2019 · Updated 6 years ago
- CausalNLP is a practical toolkit for causal inference with text as treatment, outcome, or "controlled-for" variable. ☆157 · Feb 6, 2025 · Updated last year
- Model zoo for different kinds of uncertainty quantification methods used in Natural Language Processing, implemented in PyTorch. ☆55 · May 5, 2023 · Updated 2 years ago
- [NAACL 2022] TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages ☆22 · Jun 3, 2022 · Updated 3 years ago
- Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning ☆28 · Jun 18, 2025 · Updated 10 months ago
- ☆13 · Feb 7, 2023 · Updated 3 years ago
- Introduction to Generative Adversarial Networks ☆22 · Oct 22, 2020 · Updated 5 years ago
- ☆10 · Oct 27, 2020 · Updated 5 years ago
- Cross-modal Coherence Modeling for Caption Generation ☆11 · Jul 24, 2020 · Updated 5 years ago
- Protein Sequence Evolutionary Information Language Model ☆14 · Oct 5, 2023 · Updated 2 years ago
- A repository for the EMNLP 2021 paper "Is Information Density Uniform in Task-Oriented Dialogues?" and for the CoNLL 2021 paper "Analysin… ☆10 · Jun 17, 2024 · Updated last year
- Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy. ☆17 · Jul 18, 2024 · Updated last year
- NLP course @ CS Faculty, HSE ☆15 · Mar 4, 2020 · Updated 6 years ago
- ☆13 · Mar 22, 2023 · Updated 3 years ago
- CrossRE: A Cross-Domain Dataset for Relation Extraction (Findings of EMNLP 2022) ☆50 · Aug 20, 2024 · Updated last year
- Imshow - Flexible and Customizable Image Display with Python ☆14 · Dec 27, 2025 · Updated 4 months ago
- ☆10 · Oct 4, 2022 · Updated 3 years ago
- Code files for my medium blog ☆17 · Oct 20, 2020 · Updated 5 years ago
- A Python toolkit to compute molecular features and predict activities and properties of small molecules ☆21 · Jan 28, 2022 · Updated 4 years ago
- Deep image generation is becoming a tool to enhance artists and designers creativity potential. In this paper, we aim at making the gener… ☆13 · Aug 18, 2020 · Updated 5 years ago
- 2D Fused LASSO using Gradient Descent for grayscale image restoration 🎈 ☆10 · Jan 24, 2019 · Updated 7 years ago