trevorpogue / algebraic-nnhw

Algebraic enhancements for GEMM & AI accelerators

☆278

Alternatives and similar repositories for algebraic-nnhw

Users who are interested in algebraic-nnhw are comparing it to the libraries listed below.

joennlae / halutmatmul
Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator
☆211 Updated last year
moonshine-ai / qc_npu_benchmark
Code sample showing how to run and benchmark models on Qualcomm's Windows PCs
☆100 Updated 9 months ago
joennlae / tensorli
Absolutely minimalistic implementation of a GPT-like transformer using only numpy (<650 lines).
☆253 Updated last year
DiscoGrad / DiscoGrad
DiscoGrad - automatically differentiate across conditional branches in C++ programs
☆204 Updated 10 months ago
robjinman / richard
Richard is gaining power
☆192 Updated last month
slashml / amd_inference
Docker-based inference engine for AMD GPUs
☆231 Updated 9 months ago
cjdrake / seqlogic
Sequential Logic
☆111 Updated this week
Dicklesworthstone / hoeffdings_d_explainer
A Detailed Introduction to My Favorite Statistical Measure, Hoeffding's D
☆98 Updated last year
mlecauchois / micrograd-cuda
☆249 Updated last year
a1k0n / a1gpt
throwaway GPT inference
☆140 Updated last year
nirw4nna / dsc
Tensor library & inference framework for machine learning
☆103 Updated 2 weeks ago
Foreseerr / TScale
☆196 Updated 2 months ago
Dicklesworthstone / bakery_algorithm
Lamport's Bakery Algorithm Demonstrated in Python
☆96 Updated last year
AMD-AIG-AIMA / AMD-LLM
☆188 Updated 11 months ago
westoncb / nonlinear-optics-sandbox
☆34 Updated 6 months ago
anordin95 / run-llama-locally
Run and explore Llama models locally with minimal dependencies on CPU
☆191 Updated 9 months ago
JosephSBoyle / skip_gram
This is a numpy implementation of the Skip-gram algorithm described in Mikolov et al's Word2Vec paper. It is intended for didactic purpos…
☆36 Updated 2 years ago
samvher / bert-for-laptops
A BERT that you can train on a (gaming) laptop.
☆209 Updated last year
ross39 / new_bloom_filter_repo
This repo contains a new way to use bloom filters to do lossless video compression
☆247 Updated last month
turingmotors / swan
This project aims to enable language model inference on FPGAs, supporting AI applications in edge devices and environments with limited r…
☆163 Updated last year
jostmey / NakedAttention
Revealing example of self-attention, the building block of transformer AI models
☆131 Updated 2 years ago
brianmg / voynich-nlp-analysis
☆124 Updated 2 months ago
Futrell / ziplm
☆252 Updated 2 years ago
ivanbelenky / RL
R.L. methods and techniques.
☆199 Updated 8 months ago
KienTTran / ABMGPU
Agent Based Model on GPU using CUDA 12.2.1 and OpenGL 4.5 (CUDA OpenGL interop) on Windows/Linux
☆74 Updated 4 months ago
maxilevi / raytracer
C++ raytracer that supports custom models. Supports running the calculations on the CPU using C++11 threads or in the GPU via CUDA.
☆76 Updated 2 years ago
valine / training-hot-swap
PyTorch script hot swap: Change code without unloading your LLM from VRAM
☆126 Updated 3 months ago
heap-exploitation / heap-explorer
LD_PRELOADable library for exploring the glibc heap
☆107 Updated 4 months ago
carsonpo / haystackdb
☆163 Updated last year
rodlaf / BinaryGPUIndex
A GPU Accelerated Binary Vector Store
☆47 Updated 5 months ago