ruslangrimov / mnist-minimal-model
Trying to find the minimal model that can achieve 99% accuracy on the MNIST dataset
☆26Updated 7 years ago
Alternatives and similar repositories for mnist-minimal-model
Users that are interested in mnist-minimal-model are comparing it to the libraries listed below
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit☆162Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆46Updated last year
- High-Performance SGEMM on CUDA devices☆101Updated 8 months ago
- Supplementary material for our paper "Compute Trends Across Three Eras of Machine Learning".☆42Updated 3 years ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools☆39Updated last month
- ☆12Updated 4 years ago
- bfloat16 dtype for numpy☆19Updated 2 years ago
- The Riallto Open Source Project from AMD☆83Updated 5 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆299Updated last week
- A Data-Centric Compiler for Machine Learning☆84Updated last year
- Experiment of using Tangent to autodiff triton☆81Updated last year
- FastFeedForward Networks☆20Updated last year
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator☆213Updated last year
- Samples of good AI generated CUDA kernels☆90Updated 3 months ago
- ☆52Updated last year
- Open source version of ArchGym project.☆120Updated 5 months ago
- A Deep Learning Framework for the Posit Number System☆30Updated last year
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware.☆111Updated 9 months ago
- ☆27Updated 3 weeks ago
- Attention in SRAM on Tenstorrent Grayskull☆38Updated last year
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆111Updated 11 months ago
- Example for running IREE in a bare-metal Arm environment.☆40Updated last month
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆45Updated last month
- Benchmarking PyTorch 2.0 different models☆20Updated 2 years ago
- General Matrix Multiplication using NVIDIA Tensor Cores☆20Updated 8 months ago
- A lightweight, Pythonic, frontend for MLIR☆80Updated last year
- ☆28Updated 8 months ago
- Fork of upstream onnxruntime focused on supporting risc-v accelerators☆87Updated 2 years ago
- Generate versal system design from ONNX model. AI engine kernels. Sub-microsecond speeds for autoencoders.☆14Updated 8 months ago
- train with kittens!☆62Updated 11 months ago