mikex86 / LibreCuda
☆1,062 · Updated 5 months ago
Alternatives and similar repositories for LibreCuda
Users interested in LibreCuda are comparing it to the libraries listed below.
- ☆448 · Updated 6 months ago
- NVIDIA Linux open GPU with P2P support ☆1,263 · Updated 4 months ago
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆365 · Updated 6 months ago
- ☆248 · Updated last year
- ☆189 · Updated last year
- Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, an… ☆1,528 · Updated last week
- Exploring the scalable matrix extension of the Apple M4 processor ☆208 · Updated 11 months ago
- Online compiler for HIP and NVIDIA® CUDA® code to WebGPU ☆198 · Updated 9 months ago
- Apple AMX Instruction Set ☆1,161 · Updated 10 months ago
- Nvidia Instruction Set Specification Generator ☆297 · Updated last year
- Richard is gaining power ☆197 · Updated 4 months ago
- throwaway GPT inference ☆140 · Updated last year
- LLM training in simple, raw C/HIP for AMD GPUs ☆51 · Updated last year
- llama3.np is a pure NumPy implementation of the Llama 3 model ☆989 · Updated 6 months ago
- GGUF implementation in C as a library and a CLI tool ☆291 · Updated 2 months ago
- Exocompilation for productive programming of hardware accelerators ☆676 · Updated this week
- HIPIFY: Convert CUDA to Portable C++ Code ☆625 · Updated this week
- Algebraic enhancements for GEMM & AI accelerators ☆281 · Updated 8 months ago
- TT-NN operator library and TT-Metalium low-level kernel programming model ☆1,233 · Updated last week
- GPUOcelot: A dynamic compilation framework for PTX ☆210 · Updated 8 months ago
- Docker-based inference engine for AMD GPUs ☆230 · Updated last year
- An implementation of bucketMul LLM inference ☆223 · Updated last year
- CUDA/Metal accelerated language model inference ☆617 · Updated 4 months ago
- Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦙 ☆1,345 · Updated this week
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆214 · Updated last year
- Felafax is building AI infra for non-NVIDIA GPUs ☆568 · Updated 9 months ago
- SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines ☆1,759 · Updated 4 months ago
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆568 · Updated 2 years ago
- chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs ☆299 · Updated this week
- Minimal LLM inference in Rust ☆1,013 · Updated last year