ruslangrimov / mnist-minimal-model
Trying to find the minimal model that can achieve 99% accuracy on the MNIST dataset
☆27Updated 7 years ago
Alternatives and similar repositories for mnist-minimal-model
Users that are interested in mnist-minimal-model are comparing it to the libraries listed below
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit☆163Updated last year
- bfloat16 dtype for numpy☆19Updated 2 years ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools☆40Updated 2 months ago
- The Riallto Open Source Project from AMD☆84Updated 6 months ago
- A Deep Learning Framework for the Posit Number System☆30Updated last year
- A Data-Centric Compiler for Machine Learning☆85Updated last year
- Samples of good AI generated CUDA kernels☆91Updated 4 months ago
- High-Performance SGEMM on CUDA devices☆107Updated 8 months ago
- General Matrix Multiplication using NVIDIA Tensor Cores☆22Updated 8 months ago
- Open source version of ArchGym project.☆121Updated 6 months ago
- Supplementary material for our paper "Compute Trends Across Three Eras of Machine Learning".☆43Updated 3 years ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆45Updated last year
- FastFeedForward Networks☆19Updated last year
- Benchmarking PyTorch 2.0 different models☆20Updated 2 years ago
- Attention in SRAM on Tenstorrent Grayskull☆38Updated last year
- Butterfly matrix multiplication in PyTorch☆174Updated 2 years ago
- E2E AutoML Model Compression Package☆46Updated 7 months ago
- Repository of model demos using TT-Buda☆63Updated 6 months ago
- ☆28Updated 3 weeks ago
- Experiment of using Tangent to autodiff triton☆80Updated last year
- ☆28Updated 9 months ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆301Updated last week
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆108Updated last year
- The official, proof-of-concept C++ implementation of PocketNN.☆35Updated 3 weeks ago
- ☆89Updated last year
- A high-efficiency system-on-chip for floating-point compute workloads.☆43Updated 9 months ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆45Updated 2 months ago
- ☆21Updated 2 years ago
- ☆52Updated last year
- ☆11Updated 4 years ago