elinx / ugrad
A C++ implementation of the scalar-valued autograd engine micrograd
☆23 · Updated 5 years ago
Alternatives and similar repositories for ugrad
Users who are interested in ugrad compare it to the libraries listed below.
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆377 · Updated 9 months ago
- A header-only library implementing common mathematical functions using SIMD intrinsics ☆115 · Updated 4 months ago
- A C++ port of karpathy/llm.c featuring a tiny torch library while maintaining overall simplicity. ☆42 · Updated last year
- ☆47 · Updated 7 years ago
- Accelerated General (FP32) Matrix Multiplication from scratch in CUDA ☆182 · Updated last year
- Serial and parallel implementations of matrix multiplication ☆45 · Updated 4 years ago
- A C/C++ implementation of micrograd: a tiny autograd engine with a neural net on top. ☆78 · Updated 2 years ago
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode… ☆144 · Updated 3 months ago
- LLM training in simple, raw C/CUDA ☆112 · Updated last year
- ☆138 · Updated 2 years ago
- Neural network from scratch in CUDA/C++ ☆88 · Updated 5 months ago
- Learn OpenMP examples step by step ☆102 · Updated last year
- Learning about CUDA by writing PTX code. ☆152 · Updated last year
- A neural network implementation for the MNIST dataset, written in plain C ☆102 · Updated 4 years ago
- A recurrent (LSTM) neural network in C ☆95 · Updated 4 years ago
- NVIDIA tools guide ☆157 · Updated last year
- CUDA Matrix Multiplication Optimization ☆256 · Updated last year
- Header-only safetensors loader and saver in C++ ☆78 · Updated last month
- GPUOcelot: A dynamic compilation framework for PTX ☆219 · Updated last year
- High-Performance FP32 GEMM on CUDA devices ☆117 · Updated last year
- Some CUDA example code with READMEs. ☆179 · Updated 3 months ago
- Pure C inference for the GTE Small embedding model ☆101 · Updated 3 weeks ago
- An Open Convolutional Neural Network Framework in C++ From Scratch ☆67 · Updated 4 years ago
- Custom PTX Instruction Benchmark ☆138 · Updated 11 months ago
- A curated list of awesome SIMD frameworks, libraries and software ☆233 · Updated last year
- Pure C ONNX runtime with zero dependencies for embedded devices ☆215 · Updated 2 years ago
- Learn OpenCL step by step. ☆138 · Updated 3 years ago
- NNCG: A Neural Network Code Generator ☆35 · Updated last year
- A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources ☆104 · Updated 2 years ago
- A plugin for Jupyter Notebook to run CUDA C/C++ code ☆257 · Updated last year