MurageKibicho / Quasi-Quantum-Assembly
The Quasi Quantum Assembly Programming Language
☆36 Updated last month
Alternatives and similar repositories for Quasi-Quantum-Assembly
Users who are interested in Quasi-Quantum-Assembly are comparing it to the libraries listed below.
- Train neural networks that distill into logic circuits, using JAX ☆64 Updated 6 months ago
- tiny code to access tenstorrent blackhole ☆61 Updated 7 months ago
- Tensor library & inference framework for machine learning ☆118 Updated 3 months ago
- Because it's there. ☆16 Updated last year
- asynchronous/distributed speculative evaluation for llama3 ☆39 Updated last year
- CUDA-L2: Surpassing cuBLAS Performance for Matrix Multiplication through Reinforcement Learning ☆270 Updated 3 weeks ago
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆125 Updated 8 months ago
- a categorical deep learning compiler ☆207 Updated 3 months ago
- A massively parallel, optimal functional runtime in Rust ☆31 Updated last year
- PDLP algorithm for linear programming ☆84 Updated last week
- High-Performance SGEMM on CUDA devices ☆114 Updated 11 months ago
- Framework for specifying and proving properties—such as robustness, fairness, and interpretability—of machine learning models using Lean … ☆73 Updated 5 months ago
- A package for defining deep learning models using categorical algebraic expressions. ☆61 Updated last year
- A tiny autograd engine with a Jax-like API ☆74 Updated 6 months ago
- Lightweight Llama 3 8B Inference Engine in CUDA C ☆53 Updated 9 months ago
- Cuq: A MIR-to-Coq Framework Targeting PTX for Formal Semantics and Verified Translation of Rust GPU Kernels ☆116 Updated last week
- 8-bit computational substrates ☆47 Updated last year
- Custom PTX Instruction Benchmark ☆137 Updated 10 months ago
- A fork of OpenBLAS with Armv8-A SVE (Scalable Vector Extension) support ☆17 Updated 5 years ago
- A probabilistic approximate DNF counter ☆39 Updated last month
- A high-performance attention mechanism that computes softmax normalization in a single streaming pass using running accumulators (online … ☆28 Updated 2 months ago
- Can I make an *optimizing* compiler under 1k lines of code? ☆65 Updated 10 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- An implementation of Tiny Recursive Models (TRM) ☆88 Updated this week
- ☆42 Updated last week
- Learning about CUDA by writing PTX code. ☆151 Updated last year
- Training GPTs to solve interaction nets ☆18 Updated last year
- Samples of good AI generated CUDA kernels ☆99 Updated 7 months ago
- Better bindings for Python ☆19 Updated 2 years ago
- ☆130 Updated this week