elinx / ugrad
A C++ implementation of the scalar-valued autograd engine micrograd
☆23Updated 5 years ago
Alternatives and similar repositories for ugrad
Users that are interested in ugrad are comparing it to the libraries listed below
- Neural network from scratch in CUDA/C++☆85Updated 7 months ago
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs☆351Updated 4 months ago
- A recurrent (LSTM) neural network in C☆94Updated 3 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics☆112Updated this week
- ☆106Updated 2 years ago
- A C++ port of karpathy/llm.c featuring a tiny torch library while maintaining overall simplicity.☆35Updated last year
- Pure C ONNX runtime with zero dependencies for embedded devices☆210Updated last year
- MLIR based Tiny Graph Compiler [dev-stage]☆20Updated 9 months ago
- LLM training in simple, raw C/CUDA☆104Updated last year
- A simple, fast, minimalistic header-only library for running async tasks and executing task graphs.☆56Updated 9 months ago
- Neural Network framework using Backpropagation in C☆76Updated 3 years ago
- C++ demo of deep neural networks (MLP, CNN)☆33Updated last year
- Source code for 'Modern Parallel Programming with C++ and Assembly' by Dan Kusswurm☆64Updated 3 years ago
- Serial and parallel implementations of matrix multiplication☆42Updated 4 years ago
- A single header-only C++ library for automatic / algorithmic differentiation.☆14Updated 2 years ago
- A c/c++ implementation of micrograd: a tiny autograd engine with neural net on top.☆71Updated last year
- CUDA Matrix Multiplication Optimization☆218Updated last year
- Learn OpenCL step by step.☆137Updated 3 years ago
- My C++ deep learning framework & other machine learning algorithms☆88Updated 2 years ago
- Learn OpenMP examples step by step☆96Updated 7 months ago
- A neural network implementation for the MNIST dataset, written in plain C☆97Updated 4 years ago
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode…☆135Updated last year
- Short examples illustrating AVX2 intrinsics for simple tasks.☆96Updated last year
- Converting a deep neural network to integer-only inference in native C via uniform quantization and the fixed-point representation.☆25Updated 3 years ago
- GPT-2 in C☆75Updated 8 months ago
- Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class☆15Updated 6 months ago
- Learning about CUDA by writing PTX code.☆135Updated last year
- An Open Convolutional Neural Network Framework in C++ From Scratch☆66Updated 4 years ago
- A collection of Fast Fourier Transform algorithms implemented in C++20.☆114Updated last year
- Some CUDA example code with READMEs.☆170Updated 6 months ago