EurekaLabsAI / tensor

The Tensor (or Array)

☆452

Alternatives and similar repositories for tensor

Users who are interested in tensor are comparing it to the libraries listed below.

EurekaLabsAI / micrograd
The Autograd Engine
☆662 Updated last year
clu0 / unet.cu
UNet diffusion model in pure CUDA
☆651 Updated last year
ash-01xor / bpe.c
Simple byte pair encoding mechanism used for tokenization, written purely in C
☆139 Updated 11 months ago
Quentin-Anthony / nanoMPI
Simple MPI implementation for prototyping or learning
☆287 Updated 3 months ago
ulrichstern / cuda-convnet
Alex Krizhevsky's original code from Google Code
☆199 Updated 9 years ago
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆427 Updated 7 months ago
changjonathanc / flex-nano-vllm
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
☆301 Updated this week
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆168 Updated 4 months ago
MarioSieg / magnetron
(WIP) A small but powerful, homemade PyTorch from scratch.
☆650 Updated last week
kvfrans / jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
☆293 Updated last year
smolorg / smolgrad
Small autograd engine inspired by Karpathy's micrograd and PyTorch
☆276 Updated 11 months ago
Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆422 Updated 4 months ago
mesozoic-egg / tinygrad-notes
Tutorials on tinygrad
☆436 Updated 3 weeks ago
EleutherAI / cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
☆819 Updated 3 months ago
Laz4rz / GPT-2
Following master Karpathy with a GPT-2 implementation and training, writing lots of comments 'cause I have the memory of a goldfish
☆172 Updated last year
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆195 Updated 5 months ago
Quentin-Anthony / torch-profiling-tutorial
☆523 Updated 3 months ago
gpu-mode / profiling-cuda-in-torch
☆174 Updated last year
BobMcDear / attorch
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
☆580 Updated 2 months ago
IST-DASLab / llmq
Quantized LLM training in pure CUDA/C++.
☆214 Updated this week
a-hamdi / GPU
100 days of building GPU kernels!
☆523 Updated 6 months ago
rwitten / HighPerfLLMs2024
☆545 Updated last year
obadakhalili / tinygrad-tensor-puzzles
Solve puzzles to improve your tinygrad skills!
☆146 Updated 3 weeks ago
linjames0 / Transformer-CUDA
An implementation of the transformer architecture as an Nvidia CUDA kernel
☆192 Updated 2 years ago
jax-ml / scaling-book
Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs
☆675 Updated 2 weeks ago
LambdaLabsML / distributed-training-guide
Best practices & guides on how to write distributed pytorch training code
☆526 Updated 2 weeks ago
unixpickle / learn-ptx
Learning about CUDA by writing PTX code.
☆146 Updated last year
joey00072 / Tinytorch
A really tiny autograd engine
☆96 Updated 5 months ago
facebookresearch / optimizers
For optimization algorithm research and development.
☆542 Updated last week
karpathy / calorie
nice and effective super simple calorie counter web app
☆102 Updated last year