Simple MPI implementation for prototyping or learning
☆311 · Aug 6, 2025 · Updated 8 months ago
Alternatives and similar repositories for nanoMPI
Users that are interested in nanoMPI are comparing it to the libraries listed below.
- Custom Triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- Implementing DeepSeek R1's GRPO algorithm from scratch. ☆1,834 · Apr 18, 2025 · Updated last year
- Simple byte pair encoding mechanism for tokenization, written purely in C. ☆148 · Nov 11, 2024 · Updated last year
- Minimalistic 4D-parallelism distributed training framework for educational purposes. ☆2,146 · Aug 26, 2025 · Updated 7 months ago
- CIFAR-10 speedrun: trains to 94% accuracy in 1.98 seconds on a single NVIDIA A100 GPU. ☆73 · Oct 17, 2025 · Updated 6 months ago
- Workflow of nndeploy. ☆13 · Nov 5, 2025 · Updated 5 months ago
- An experimental communicating attention kernel based on DeepEP. ☆35 · Jul 29, 2025 · Updated 8 months ago
- UNet diffusion model in pure CUDA. ☆657 · Jun 28, 2024 · Updated last year
- Tile primitives for speedy kernels