Const-me / Cgml
GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.
☆58Updated 2 years ago
Alternatives and similar repositories for Cgml
Users who are interested in Cgml are comparing it to the libraries listed below
- A library for incremental loading of large PyTorch checkpoints☆56Updated 2 years ago
- Wang Yi's GPT solution☆142Updated 2 years ago
- A playground to make it easy to try crazy things☆33Updated 2 months ago
- Tiny Dream - An embedded, Header Only, Stable Diffusion C++ implementation☆266Updated 2 years ago
- C++ raytracer that supports custom models. Supports running the calculations on the CPU using C++11 threads or in the GPU via CUDA.☆74Updated 3 years ago
- Richard is gaining power☆200Updated 7 months ago
- A JPEG Image Compression Service using Partially Homomorphic Encryption.☆31Updated 11 months ago
- An implementation of bucketMul LLM inference☆224Updated last year
- Mistral7B playing DOOM☆139Updated last year
- Algebraic enhancements for GEMM & AI accelerators☆287Updated 11 months ago
- ☆200Updated 9 months ago
- Hierarchical Navigable Small Worlds☆102Updated 6 months ago
- A graphics engine that executes entirely on the CPU☆226Updated last year
- A CLI to manage, install, and configure llama inference implementations in multiple languages☆65Updated 2 years ago
- WebGPU LLM inference tuned by hand☆151Updated 2 years ago
- throwaway GPT inference☆141Updated last year
- Code sample showing how to run and benchmark models on Qualcomm's Windows PCs☆104Updated last year
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines).☆254Updated 2 years ago
- A copy of ONNX models, datasets, and code all in one GitHub repository. Follow the README to learn more.☆105Updated 2 years ago
- PyTorch script hot swap: Change code without unloading your LLM from VRAM☆125Updated 9 months ago
- A BERT that you can train on a (gaming) laptop.☆209Updated 2 years ago
- ☆63Updated last year
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"☆200Updated 11 months ago
- Docker-based inference engine for AMD GPUs☆231Updated last year
- Experiments with BitNet inference on CPU☆55Updated last year
- Revealing example of self-attention, the building block of transformer AI models☆131Updated 2 years ago
- LLaVA server (llama.cpp).☆183Updated 2 years ago
- minimal yet working VPN daemon for Linux☆106Updated 5 months ago
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU☆13Updated last year
- ☆191Updated last year