AMD-AIG-AIMA / AMD-LLM
☆186Updated 8 months ago
Alternatives and similar repositories for AMD-LLM:
Users who are interested in AMD-LLM are comparing it to the libraries listed below:
- ☆164Updated this week
- Code sample showing how to run and benchmark models on Qualcomm's Windows PCs☆96Updated 7 months ago
- Docker-based inference engine for AMD GPUs☆230Updated 6 months ago
- An implementation of bucketMul LLM inference☆217Updated 10 months ago
- ☆241Updated last year
- Run and explore Llama models locally with minimal dependencies on CPU☆189Updated 6 months ago
- Dead Simple LLM Abliteration☆212Updated 2 months ago
- ☆163Updated 11 months ago
- Algebraic enhancements for GEMM & AI accelerators☆275Updated 2 months ago
- Richard is gaining power☆186Updated 5 months ago
- PyTorch script hot swap: Change code without unloading your LLM from VRAM☆123Updated 2 weeks ago
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines).☆251Updated last year
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l…☆283Updated this week
- Online compiler for HIP and NVIDIA® CUDA® code to WebGPU☆147Updated 3 months ago
- A copy of ONNX models, datasets, and code all in one GitHub repository. Follow the README to learn more.☆105Updated last year
- throwaway GPT inference☆139Updated 11 months ago
- GGUF implementation in C as a library and a tools CLI program☆269Updated 3 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …☆221Updated 4 months ago
- A CLI to manage, install, and configure llama inference implementation in multiple languages☆66Updated last year
- Felafax is building AI infra for non-NVIDIA GPUs☆559Updated 3 months ago
- Fully Open Language Models with Stellar Performance☆229Updated last month
- RebootX On-Prem is an open source specification for defining a custom server in order to manage on-premise runnables and dashboards in th…☆100Updated 3 weeks ago
- minimal yet working VPN daemon for Linux☆106Updated this week
- A GPU Accelerated Binary Vector Store☆47Updated 2 months ago
- Agent Based Model on GPU using CUDA 12.2.1 and OpenGL 4.5 (CUDA OpenGL interop) on Windows/Linux☆70Updated 2 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙☆747Updated this week
- A BERT that you can train on a (gaming) laptop.☆208Updated last year
- 100k real ( +100k random ) galaxies from a sector. Visualized with Raylib.☆87Updated last month
- Lamport's Bakery Algorithm Demonstrated in Python☆96Updated last year
- Neurox control helm chart details☆27Updated this week