elinx / ugrad
A C++ implementation of the scalar-valued autograd engine micrograd
☆23 Updated 5 years ago
Alternatives and similar repositories for ugrad
Users who are interested in ugrad are comparing it to the libraries listed below.
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆356 Updated 5 months ago
- Neural network from scratch in CUDA/C++ ☆85 Updated 2 weeks ago
- LLM training in simple, raw C/CUDA ☆104 Updated last year
- A C/C++ implementation of micrograd: a tiny autograd engine with a neural net on top. ☆72 Updated 2 years ago
- A C++ port of karpathy/llm.c featuring a tiny torch library while maintaining overall simplicity. ☆36 Updated last year
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode… ☆138 Updated last year
- Learn OpenCL step by step. ☆137 Updated 3 years ago
- A neural network implementation for the MNIST dataset, written in plain C ☆99 Updated 4 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics ☆112 Updated last week
- Scalar-valued Automatic Differentiation library in C ☆53 Updated 2 years ago
- Learn OpenMP examples step by step ☆96 Updated 8 months ago
- Neural Network framework using Backpropagation in C ☆77 Updated 3 years ago
- Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class ☆16 Updated 7 months ago
- Some CUDA example code with READMEs. ☆174 Updated 6 months ago
- My C++ deep learning framework & other machine learning algorithms ☆88 Updated 2 years ago
- A C++ port of karpathy/micrograd, a tiny scalar-valued autograd engine and a neural net library ☆13 Updated last year
- A recurrent (LSTM) neural network in C ☆95 Updated 3 years ago
- CUDA Matrix Multiplication Optimization ☆222 Updated last year
- MLIR based Tiny Graph Compiler [dev-stage] ☆20 Updated 10 months ago
- Pure C ONNX runtime with zero dependencies for embedded devices ☆210 Updated last year
- Examples from the "C++ From Scratch" Series ☆91 Updated 2 years ago
- A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources ☆101 Updated 2 years ago
- Learning about CUDA by writing PTX code. ☆135 Updated last year
- High-Performance SGEMM on CUDA devices ☆101 Updated 8 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆208 Updated 7 months ago
- Custom PTX Instruction Benchmark ☆127 Updated 6 months ago
- Serial and parallel implementations of matrix multiplication ☆43 Updated 4 years ago