kfish / micrograd-cpp-2023
A C++ port of karpathy/micrograd, a tiny scalar-valued autograd engine and a neural net library
☆13 Updated last year
Alternatives and similar repositories for micrograd-cpp-2023
Users who are interested in micrograd-cpp-2023 compare it to the libraries listed below.
- Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class ☆16 Updated 7 months ago
- C implementation of the L-Mul f32/f16 multiplications from paper: https://arxiv.org/html/2410.00907 ☆28 Updated last year
- MLIR-based toolkit targeting intel heterogeneous hardware ☆48 Updated 7 months ago
- Header-only safetensors loader and saver in C++ ☆67 Updated 5 months ago
- Isolating mlir tutorial dialect implementation ☆25 Updated 2 months ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆56 Updated 6 months ago
- Little OpenMP Library ☆168 Updated 3 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated 2 years ago
- ☆23 Updated 3 years ago
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆364 Updated 5 months ago
- Examples from Programming in Parallel with CUDA ☆161 Updated 2 years ago
- A minimal (really) out-of-tree MLIR example ☆45 Updated 2 months ago
- The Farm-SVE package provides a header that implements the ARM C language extensions (ACLE) for the ARM Scalable Vector Extension (SVE) i… ☆15 Updated last year
- Source code for 'Modern Parallel Programming with C++ and Assembly' by Dan Kusswurm ☆65 Updated 3 years ago
- A lightweight memory allocator for hardware-accelerated machine learning ☆170 Updated 2 weeks ago
- GPUOcelot: A dynamic compilation framework for PTX ☆210 Updated 8 months ago
- Stepwise optimizations of DGEMM on CPU, reaching performance faster than Intel MKL eventually, even under multithreading. ☆152 Updated 3 years ago
- Efficient implementations of Merge Sort and Bitonic Sort algorithms using CUDA for GPU parallel processing, resulting in accelerated sort… ☆18 Updated 2 years ago
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A… ☆278 Updated 6 months ago
- Retargetable ML compilers for the twenty-first century! ☆13 Updated 5 months ago
- SYCL Benchmark Suite ☆65 Updated 3 months ago
- Website for CS 265 ☆30 Updated 9 months ago
- TPP experimentation on MLIR for linear algebra ☆137 Updated 2 weeks ago
- A compiler for Tiger language includes lexical analysis using flexc++, parsing using Bisonc++, type checking, building abstract syntax tr… ☆13 Updated 2 years ago
- amdgpu example code in hip/asm ☆43 Updated last week
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆95 Updated 3 years ago
- Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda ☆91 Updated last week
- ☆152 Updated this week
- MLIR based Tiny Graph Compiler [dev-stage] ☆19 Updated 10 months ago
- ☆17 Updated last year