elinx / ugrad
A C++ implementation of the scalar-valued autograd engine micrograd
☆23 · Updated 4 years ago
Alternatives and similar repositories for ugrad:
Users who are interested in ugrad are comparing it to the libraries listed below.
- High-Performance FP32 Matrix Multiplication on CPU ☆333 · Updated this week
- LLM training in simple, raw C/CUDA ☆91 · Updated 9 months ago
- An Open Convolutional Neural Network Framework in C++ From Scratch ☆60 · Updated 3 years ago
- Minimal C++ implementation of GPT2 ☆40 · Updated last year
- Minimal deep learning library written from scratch in Python, using NumPy/CuPy. ☆120 · Updated 2 years ago
- ☆44 · Updated 6 years ago
- MLIR based Tiny Graph Compiler [dev-stage] ☆15 · Updated 2 months ago
- Mathematics library for C and C++ ☆50 · Updated 11 months ago
- Deep Neural Network Architectures with dlib ☆18 · Updated 3 weeks ago
- A c/c++ implementation of micrograd: a tiny autograd engine with neural net on top. ☆63 · Updated last year
- A header only library implementing common mathematical functions using SIMD intrinsics ☆97 · Updated this week
- pytorch from scratch in pure C/CUDA and python ☆40 · Updated 4 months ago
- Code for NVIDIA's CUDA By Example Book. ☆43 · Updated 4 years ago
- A simple and fast minimalistic header-only library allowing to run async tasks and execute task graphs. ☆51 · Updated 2 months ago
- A C++ port of karpathy/llm.c features a tiny torch library while maintaining overall simplicity. ☆24 · Updated 6 months ago
- Clover: Quantized 4-bit Linear Algebra Library ☆112 · Updated 6 years ago
- Swin Transformer C++ Implementation ☆60 · Updated 3 years ago
- Serial and parallel implementations of matrix multiplication ☆39 · Updated 4 years ago
- High-Performance SGEMM on CUDA devices ☆74 · Updated 3 weeks ago
- ☆131 · Updated last year
- TinyFive is a lightweight RISC-V emulator and assembler written in Python with neural network examples ☆54 · Updated last year
- Learn OpenCL step by step. ☆133 · Updated 2 years ago
- A fast implementation of log() and exp() ☆51 · Updated 2 years ago
- CUDA Matrix Multiplication Optimization ☆161 · Updated 7 months ago
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆169 · Updated last year
- Scalar-valued Automatic Differentiation library in C ☆46 · Updated last year
- NNCG: A Neural Network Code Generator ☆35 · Updated 6 months ago
- Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O ☆245 · Updated last month
- CUDA kernel author's tools ☆110 · Updated 2 years ago