tinygrad / tinyos
☆42 Updated this week
Alternatives and similar repositories for tinyos
Users that are interested in tinyos are comparing it to the libraries listed below
- An implementation of delta-iris in tinygrad ☆72 Updated last year
- SIMD quantization kernels ☆89 Updated last month
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆63 Updated last year
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆133 Updated last month
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- tiny code to access tenstorrent blackhole ☆60 Updated 5 months ago
- ☆94 Updated last week
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆66 Updated 7 months ago
- Quantized LLM training in pure CUDA/C++. ☆209 Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆58 Updated 2 weeks ago
- Hand-Rolled GPU communications library ☆54 Updated this week
- Transformer GPU VRAM estimator ☆67 Updated last year
- Learning about CUDA by writing PTX code. ☆145 Updated last year
- PyTorch script hot swap: Change code without unloading your LLM from VRAM ☆124 Updated 6 months ago
- Simple Transformer in JAX ☆139 Updated last year
- Standalone command-line tool for compiling Triton kernels ☆18 Updated last year
- Solidity contracts for the decentralized Prime Network protocol ☆27 Updated 3 months ago
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆44 Updated 2 years ago
- Make triton easier ☆48 Updated last year
- DeMo: Decoupled Momentum Optimization ☆194 Updated 10 months ago
- ☆76 Updated this week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆74 Updated 8 months ago
- Modded vLLM to run pipeline parallelism over public networks ☆39 Updated 5 months ago
- Custom PTX Instruction Benchmark ☆131 Updated 8 months ago
- Solve puzzles to improve your tinygrad skills! ☆145 Updated 2 weeks ago
- Train neural networks that distill into logic circuits, using JAX ☆62 Updated 4 months ago
- A really tiny autograd engine ☆95 Updated 5 months ago
- in this repository, i'm going to implement increasingly complex llm inference optimizations ☆68 Updated 5 months ago
- 👷 Build compute kernels ☆163 Updated this week
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆107 Updated 7 months ago