tinygrad / tinyos
☆43 Updated 3 weeks ago
Alternatives and similar repositories for tinyos
Users that are interested in tinyos are comparing it to the libraries listed below
- An implementation of delta-iris in tinygrad ☆72 Updated last year
- SIMD quantization kernels ☆93 Updated 4 months ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings. ☆63 Updated last year
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆141 Updated 4 months ago
- tiny code to access tenstorrent blackhole ☆61 Updated 7 months ago
- ☆97 Updated this week
- Enabling tinygrad compatibility with the Google Edge TPU ☆85 Updated last year
- Modded vLLM to run pipeline parallelism over public networks ☆41 Updated 7 months ago
- Quantized LLM training in pure CUDA/C++. ☆231 Updated this week
- Solidity contracts for the decentralized Prime Network protocol ☆27 Updated 6 months ago
- look how they massacred my boy ☆63 Updated last year
- Can RL solve simple problems? ☆54 Updated 2 years ago
- Custom PTX Instruction Benchmark ☆137 Updated 10 months ago
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆66 Updated 9 months ago
- MoE training for Me and You and maybe other people ☆319 Updated 2 weeks ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆66 Updated last week
- peer-to-peer compute and intelligence network that enables decentralized AI development at scale ☆136 Updated 2 months ago
- Because it's there. ☆16 Updated last year
- Official CLI and Python SDK for Prime Intellect - access GPU compute, remote sandboxes, RL environments, and distributed training infrast… ☆134 Updated this week
- Learning about CUDA by writing PTX code. ☆151 Updated last year
- 👷 Build compute kernels ☆201 Updated last week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 Updated 11 months ago
- ☆22 Updated last year
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆318 Updated last week
- The Quasi Quantum Assembly Programming Language ☆36 Updated 2 months ago
- Solve puzzles to improve your tinygrad skills! ☆175 Updated 3 months ago
- Simple Transformer in Jax ☆142 Updated last year
- in this repository, i'm going to implement increasingly complex llm inference optimizations ☆81 Updated 7 months ago
- DeMo: Decoupled Momentum Optimization ☆198 Updated last year