ruslangrimov / mnist-minimal-model
Trying to find the minimal model that can achieve 99% accuracy on the MNIST dataset
☆27 Updated 7 years ago
Alternatives and similar repositories for mnist-minimal-model
Users that are interested in mnist-minimal-model are comparing it to the libraries listed below
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆169 Updated last year
- Open source version of ArchGym project. ☆121 Updated 6 months ago
- The Riallto Open Source Project from AMD ☆84 Updated 6 months ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆40 Updated 3 months ago
- General Matrix Multiplication using NVIDIA Tensor Cores ☆24 Updated 9 months ago
- Samples of good AI generated CUDA kernels ☆91 Updated 5 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆45 Updated last year
- bfloat16 dtype for numpy ☆20 Updated 2 years ago
- A Data-Centric Compiler for Machine Learning ☆85 Updated last year
- Attention in SRAM on Tenstorrent Grayskull ☆38 Updated last year
- A Deep Learning Framework for the Posit Number System ☆30 Updated last year
- Butterfly matrix multiplication in PyTorch ☆174 Updated 2 years ago
- ☆11 Updated 4 years ago
- High-Performance SGEMM on CUDA devices ☆109 Updated 9 months ago
- ☆15 Updated last year
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆305 Updated last week
- Experiment of using Tangent to autodiff triton ☆79 Updated last year
- ☆28 Updated last month
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆110 Updated last year
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆111 Updated 11 months ago
- Sparsity support for PyTorch ☆37 Updated 7 months ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆41 Updated last year
- An experimental CPU backend for Triton (https://github.com/openai/triton) ☆47 Updated 2 months ago
- Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices ☆12 Updated 4 years ago
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆214 Updated last year
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆64 Updated 9 months ago
- E2E AutoML Model Compression Package ☆46 Updated 7 months ago
- A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission. ☆14 Updated 2 years ago
- Machine-Learning Accelerator System Exploration Tools ☆182 Updated last week
- Automatic differentiation for Triton Kernels ☆28 Updated 2 months ago