mikex86 / LibreCuda
☆1,000 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for LibreCuda
- NVIDIA Linux open GPU with P2P support ☆903 · Updated 5 months ago
- Tile primitives for speedy kernels ☆1,645 · Updated this week
- SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines. ☆555 · Updated this week
- Stateful load balancer custom-tailored for llama.cpp ☆557 · Updated last week
- llama3.np is a pure NumPy implementation of the Llama 3 model. ☆973 · Updated 5 months ago
- Felafax is building AI infra for non-NVIDIA GPUs ☆503 · Updated last week
- Nvidia Instruction Set Specification Generator ☆215 · Updated 4 months ago
- Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, an… ☆975 · Updated this week
- Richard is gaining power ☆174 · Updated 2 months ago
- nanoGPT style version of Llama 3.1 ☆1,236 · Updated 3 months ago
- Apple AMX Instruction Set ☆992 · Updated 5 months ago
- Llama 2 Everywhere (L2E) ☆1,511 · Updated 2 weeks ago
- Flash Attention in ~100 lines of CUDA (forward pass only) ☆615 · Updated 7 months ago
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). ☆366 · Updated 8 months ago
- A modern model graph visualizer and debugger ☆1,046 · Updated this week
- Minimal LLM inference in Rust ☆917 · Updated 2 weeks ago
- GGUF implementation in C as a library and a CLI tool ☆242 · Updated 4 months ago
- Deep learning accelerator architectures requiring half the multipliers ☆262 · Updated 7 months ago
- Fast, Multi-threaded Matrix Multiplication in C ☆181 · Updated 3 weeks ago
- High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild ☆1,639 · Updated this week
- LLM-powered lossless compression tool ☆252 · Updated 2 months ago
- Because tinygrad got out of hand with line count ☆143 · Updated 3 weeks ago
- throwaway GPT inference ☆139 · Updated 5 months ago
- NanoGPT (124M) quality in 7.8 8xH100-minutes ☆965 · Updated this week
- UNet diffusion model in pure CUDA ☆573 · Updated 4 months ago
- Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and in… ☆1,486 · Updated this week