tensara / cli
CLI tool for submitting GPU kernels
☆12 Updated 6 months ago
Alternatives and similar repositories for cli
Users who are interested in cli are comparing it to the repositories listed below
- speedrun implementation of dl papers throughout history ☆33 Updated last year
- Keeping track of problems ive solved ☆12 Updated 3 years ago
- Competitive GPU kernel optimization platform. ☆141 Updated last week
- Tutorials on tinygrad ☆444 Updated 2 months ago
- Solve puzzles to improve your tinygrad skills! ☆164 Updated 2 months ago
- High Quality Resources on GPU Programming/Architecture ☆589 Updated last year
- could we make an ml stack in 100,000 lines of code? ☆46 Updated last year
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch ☆277 Updated last year
- ☆97 Updated last week
- A faster, more user-friendly course catalog. ☆34 Updated 3 weeks ago
- Simple Transformer in Jax ☆139 Updated last year
- Learning about CUDA by writing PTX code. ☆149 Updated last year
- parallelized hyperdimensional tictactoe ☆126 Updated last year
- Following Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆172 Updated last year
- FastAsk is a Python package that installs an easy to use command to your terminal to get a quick answer to a question, using either OpenA… ☆53 Updated 11 months ago
- in this repository, i'm going to implement increasingly complex llm inference optimizations ☆73 Updated 6 months ago
- This repo is my attempt at a rough implementation of nanoGPT trained on a dataset of 30,000 unique Twitter usernames ☆24 Updated last year
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆63 Updated last year
- Learnings and programs related to CUDA ☆428 Updated 5 months ago
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆196 Updated 2 years ago
- Eric's personal notes from NYSRG https://notes.ekzhang.com/events/nysrg ☆47 Updated this week
- Alex Krizhevsky's original code from Google Code ☆197 Updated 9 years ago
- My path from leetcode hell to bigtech heaven ☆16 Updated last year
- ☆96 Updated last year
- Solve Puzzles. Learn Metal 🤘 ☆593 Updated last year
- Complete solutions to the Programming Massively Parallel Processors Edition 4 ☆602 Updated 5 months ago
- a highly efficient compression algorithm for the n1 implant (neuralink's compression challenge) ☆46 Updated last year
- TransformerCPP is a minimal C++ machine learning library with autograd and tensor ops, inspired by PyTorch. It includes a from-scratch Tr… ☆39 Updated last month
- Semantic search over every Emergent Ventures winner. ☆27 Updated last week
- Solve puzzles. Learn CUDA. ☆64 Updated 2 years ago