Const-me / Cgml
GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.
☆57 · Updated last year
Alternatives and similar repositories for Cgml
Users interested in Cgml are comparing it to the libraries listed below.
- A fork of llama3.c used for R&D on inference ☆22 · Updated 5 months ago
- C++ raytracer that supports custom models. Runs the calculations on the CPU using C++11 threads or on the GPU via CUDA. ☆74 · Updated 2 years ago
- A JPEG image compression service using partial homomorphic encryption ☆30 · Updated 2 months ago
- A library for incremental loading of large PyTorch checkpoints ☆56 · Updated 2 years ago
- A playground to make it easy to try crazy things ☆33 · Updated last month
- Port of Suno AI's Bark in C/C++ for fast inference ☆51 · Updated last year
- A web app to explore topics using LLMs (less typing, more clicks) ☆67 · Updated last year
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization, with PyTorch/CUDA ☆37 · Updated last year
- Code sample showing how to run and benchmark models on Qualcomm's Windows PCs ☆97 · Updated 8 months ago
- Richard is gaining power ☆187 · Updated 6 months ago
- A CLI to manage, install, and configure llama inference implementations in multiple languages ☆66 · Updated last year
- ☆192 · Updated last month
- Hierarchical Navigable Small Worlds ☆96 · Updated last month
- Tiny Dream - an embedded, header-only Stable Diffusion C++ implementation ☆261 · Updated last year
- throwaway GPT inference ☆139 · Updated last year
- LLaVA server (llama.cpp) ☆179 · Updated last year
- ☆163 · Updated last year
- LLM inference in C/C++ ☆23 · Updated 8 months ago
- A copy of ONNX models, datasets, and code, all in one GitHub repository. Follow the README to learn more. ☆105 · Updated last year
- Lightweight Llama 3 8B inference engine in CUDA C ☆46 · Updated 2 months ago
- DiscoGrad - automatically differentiate across conditional branches in C++ programs ☆202 · Updated 8 months ago
- Simple LLM inference server ☆20 · Updated 11 months ago
- GGML implementation of the BERT model with Python bindings and quantization ☆54 · Updated last year
- Experiments with BitNet inference on CPU ☆55 · Updated last year
- The procedure and code to run the shap-e sample code locally ☆116 · Updated last year
- Local LLM inference and management server with a built-in OpenAI-compatible API ☆31 · Updated last year
- A fork of OpenBLAS with Armv8-A SVE (Scalable Vector Extension) support ☆17 · Updated 5 years ago
- Mistral 7B playing DOOM ☆131 · Updated 10 months ago
- An implementation of bucketMul LLM inference ☆217 · Updated 11 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files ☆46 · Updated 10 months ago