ruslangrimov / mnist-minimal-model
Trying to find the minimal model that can achieve 99% accuracy on the MNIST dataset
☆ 25 · Updated 6 years ago
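To make "minimal" concrete, the parameter count of a candidate model can be budgeted by hand. The sketch below does this for a hypothetical tiny MNIST CNN; the layer sizes are illustrative assumptions, not the architecture used in this repository.

```python
# Parameter budget for a hypothetical tiny MNIST CNN
# (illustrative layer sizes; not taken from this repository).

def conv2d_params(in_ch, out_ch, k):
    """Weights plus biases for a k x k convolution layer."""
    return out_ch * in_ch * k * k + out_ch

def linear_params(in_f, out_f):
    """Weights plus biases for a fully connected layer."""
    return in_f * out_f + out_f

total = (
    conv2d_params(1, 8, 3)         # 28x28 -> 26x26, then 2x2 pool -> 13x13
    + conv2d_params(8, 16, 3)      # 13x13 -> 11x11, then 2x2 pool -> 5x5
    + linear_params(16 * 5 * 5, 10)  # flatten -> 10 logits
)
print(total)  # 5258
```

Counting parameters this way shows why such experiments focus on convolutions: almost all of the budget above sits in the final fully connected layer, so shrinking the flattened feature map is the main lever.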
Alternatives and similar repositories for mnist-minimal-model
Users interested in mnist-minimal-model are comparing it to the libraries listed below.
- FastFeedForward Networks ☆ 19 · Updated last year
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆ 45 · Updated 10 months ago
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" ☆ 148 · Updated last year
- bfloat16 dtype for numpy ☆ 19 · Updated last year
- Supplementary material for our paper "Compute Trends Across Three Eras of Machine Learning". ☆ 40 · Updated 3 years ago
- General Matrix Multiplication using NVIDIA Tensor Cores ☆ 17 · Updated 3 months ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆ 106 · Updated 7 months ago
- Experiment of using Tangent to autodiff triton ☆ 78 · Updated last year
- E2E AutoML Model Compression Package ☆ 46 · Updated 2 months ago
- The Riallto Open Source Project from AMD ☆ 77 · Updated last month
- ☆ 27 · Updated 4 months ago
- Repository for CPU Kernel Generation for LLM Inference ☆ 26 · Updated last year
- Unit Scaling demo and experimentation code ☆ 16 · Updated last year
- Lightweight Python Wrapper for OpenVINO, enabling LLM inference on NPUs ☆ 20 · Updated 5 months ago
- A list of awesome neural-symbolic papers. ☆ 47 · Updated 2 years ago
- Adaptive floating-point based numerical format for resilient deep learning ☆ 14 · Updated 3 years ago
- Machine Learning Agility (MLAgility) benchmark and benchmarking tools ☆ 39 · Updated this week
- Customized matrix multiplication kernels ☆ 54 · Updated 3 years ago
- A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission. ☆ 14 · Updated 2 years ago
- Converting a deep neural network to integer-only inference in native C via uniform quantization and the fixed-point representation. ☆ 23 · Updated 3 years ago
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆ 110 · Updated 5 months ago
- ☆ 47 · Updated 10 months ago
- High-Performance SGEMM on CUDA devices ☆ 91 · Updated 3 months ago
- Hacks for PyTorch ☆ 19 · Updated 2 years ago
- Make triton easier ☆ 47 · Updated 11 months ago
- Code for the note "NF4 Isn't Information Theoretically Optimal (and that's Good)" ☆ 18 · Updated last year
- ☆ 18 · Updated last year
- ☆ 29 · Updated last year
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆ 41 · Updated last year
- ☆ 21 · Updated last year