Const-me / Cgml
GPU-targeted, vendor-agnostic AI library for Windows, with a Mistral model implementation.
☆53 · Updated last year
Alternatives and similar repositories for Cgml:
Users interested in Cgml are comparing it to the libraries listed below.
- A library for incremental loading of large PyTorch checkpoints ☆56 · Updated last year
- Tiny Dream - An embedded, header-only Stable Diffusion C++ implementation ☆257 · Updated last year
- Experiments with BitNet inference on CPU ☆52 · Updated 9 months ago
- GGML implementation of BERT model with Python bindings and quantization ☆52 · Updated 11 months ago
- C++ raytracer that supports custom models. Supports running the calculations on the CPU using C++11 threads or on the GPU via CUDA ☆75 · Updated 2 years ago
- A CLI to manage, install, and configure llama inference implementations in multiple languages ☆65 · Updated last year
- Code sample showing how to run and benchmark models on Qualcomm's Windows PCs ☆91 · Updated 3 months ago
- Throwaway GPT inference ☆140 · Updated 7 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆55 · Updated 9 months ago
- Lightweight Llama 3 8B inference engine in CUDA C ☆42 · Updated last week
- An implementation of bucketMul LLM inference ☆214 · Updated 6 months ago
- A live multiplayer trivia game where users can bid for the subject of the next question ☆22 · Updated 2 months ago
- Algebraic enhancements for deep learning accelerator architectures ☆264 · Updated this week
- A fork of llama3.c used to do some R&D on inferencing ☆17 · Updated 3 weeks ago
- ☆53 · Updated 4 months ago
- ☆163 · Updated 7 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files ☆44 · Updated 5 months ago
- Image Generation API Server - Similar to https://text-generator.io but for images ☆49 · Updated last month
- GGUF implementation in C as a library and a tools CLI program ☆251 · Updated last week
- llama.cpp fork with additional SOTA quants and improved performance ☆126 · Updated this week
- Mistral7B playing DOOM ☆123 · Updated 6 months ago
- ☆180 · Updated 4 months ago
- Richard is gaining power ☆181 · Updated last month
- Iterate quickly with llama.cpp hot reloading; use the llama.cpp bindings with bun.sh ☆48 · Updated last year
- Wang Yi's GPT solution ☆142 · Updated last year
- Visual inference exploration & experimentation playground ☆84 · Updated last month
- A tiny version of GPT fully implemented in Python with zero dependencies ☆61 · Updated last month
- Simple LLM inference server ☆19 · Updated 7 months ago
- Online compiler for HIP and NVIDIA® CUDA® code to WebGPU ☆121 · Updated last week
- A graphics engine that executes entirely on the CPU ☆220 · Updated 10 months ago