☆454 · Dec 18, 2025 · Updated 2 months ago
Alternatives and similar repositories for mnist-cuda
Users who are interested in mnist-cuda are comparing it to the repositories listed below.
- ☆3,316 · Feb 7, 2026 · Updated 3 weeks ago
- Apply GPU in ML and DL ☆56 · Updated this week
- Fast CUDA matrix multiplication from scratch ☆1,071 · Sep 2, 2025 · Updated 6 months ago
- Step by step implementation of a fast softmax kernel in CUDA ☆62 · Jan 6, 2025 · Updated last year
- Learnings and programs related to CUDA ☆433 · Jun 29, 2025 · Updated 8 months ago
- 100 days of building GPU kernels! ☆573 · Apr 27, 2025 · Updated 10 months ago
- (WIP) A small but powerful, homemade PyTorch from scratch. ☆676 · Feb 24, 2026 · Updated last week
- Material for gpu-mode lectures ☆5,800 · Feb 1, 2026 · Updated last month
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆869 · Mar 29, 2025 · Updated 11 months ago
- Bunch of notebooks for pre-training custom Saiga-like LLM ☆12 · Feb 9, 2024 · Updated 2 years ago
- Learning about CUDA by writing PTX code. ☆153 · Feb 27, 2024 · Updated 2 years ago
- Atom package that integrates crystal tools ☆10 · Sep 5, 2015 · Updated 10 years ago
- ☆120 · Dec 9, 2025 · Updated 2 months ago
- machine learning from absolute scratch in c. gradients, linear algebra ops & everything else without using any third party library! ☆26 · Aug 3, 2024 · Updated last year
- Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch ☆941 · Jul 19, 2023 · Updated 2 years ago
- This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast… ☆440 · Feb 22, 2025 · Updated last year
- High performance Rust API for KDB+ ☆13 · Jun 27, 2021 · Updated 4 years ago
- Machine Learning for Computer Systems ☆17 · Dec 16, 2025 · Updated 2 months ago
- My Submission for the OpenAI/NeurIPS ProcGen Competition ☆11 · Nov 12, 2020 · Updated 5 years ago
- GAIL learning to imitate PPO playing CartPole. ☆12 · May 27, 2021 · Updated 4 years ago
- ☆91 · Feb 29, 2024 · Updated 2 years ago
- GPU programming related news and material links ☆2,010 · Sep 17, 2025 · Updated 5 months ago
- A package for estimating and regularising correlation and covariance matrices with high frequency financial data ☆14 · Feb 4, 2026 · Updated last month
- Personal solutions to the Triton Puzzles ☆20 · Jul 18, 2024 · Updated last year
- DEX Cyclic Arbitrage Analysis ☆15 · Jan 14, 2022 · Updated 4 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆16 · Oct 3, 2023 · Updated 2 years ago
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch) ☆81 · Feb 10, 2026 · Updated 3 weeks ago
- A high-performance software package for training and evaluating machine-learned XC functionals using the CIDER framework ☆18 · Dec 14, 2025 · Updated 2 months ago
- The educational course dedicated to FOSS culture and toolchain ☆20 · Aug 22, 2025 · Updated 6 months ago
- Small scale distributed training of sequential deep learning models, built on Numpy and MPI. ☆162 · Oct 19, 2023 · Updated 2 years ago
- Samples for CUDA Developers which demonstrates features in CUDA Toolkit ☆8,899 · Jan 6, 2026 · Updated last month
- ☆90 · Nov 11, 2025 · Updated 3 months ago
- The official Github Repo and Download for the FNAF Mod ☆10 · Nov 10, 2015 · Updated 10 years ago
- CUDA Matrix Multiplication Optimization ☆261 · Jul 19, 2024 · Updated last year
- ☆417 · Apr 10, 2025 · Updated 10 months ago
- ☆22 · Aug 26, 2024 · Updated last year
- making the official triton tutorials actually comprehensible ☆123 · Aug 25, 2025 · Updated 6 months ago
- Complete solutions to the Programming Massively Parallel Processors Edition 4 ☆676 · Jun 18, 2025 · Updated 8 months ago
- From Minimal GEMM to Everything ☆163 · Feb 10, 2026 · Updated 3 weeks ago