philpax / ggml
Tensor library for machine learning
☆21Updated last year
Alternatives and similar repositories for ggml
Users that are interested in ggml are comparing it to the libraries listed below
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …☆52Updated 8 months ago
- ☆20Updated last year
- Experiments with BitNet inference on CPU☆54Updated last year
- GGUF parser in Python☆28Updated last year
- Transformer GPU VRAM estimator☆67Updated last year
- Chroma's fork of hnswlib - a header-only C++/python library for fast approximate nearest neighbors☆20Updated 3 weeks ago
- Trying to deconstruct RWKV in understandable terms☆14Updated 2 years ago
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.☆74Updated 8 months ago
- Simple high-throughput inference library☆149Updated 5 months ago
- Public reports detailing responses to sets of prompts by Large Language Models.☆31Updated 9 months ago
- First token cutoff sampling inference example☆30Updated last year
- Port of Facebook's LLaMA model in C/C++☆23Updated last year
- Inference Llama 2 in one file of pure C++☆84Updated 2 years ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆90Updated this week
- inference code for mixtral-8x7b-32kseqlen☆102Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model☆18Updated last year
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆114Updated 3 months ago
- Course Project for COMP4471 on RWKV☆17Updated last year
- Port of Microsoft's BioGPT in C/C++ using ggml☆85Updated last year
- GGUF implementation in C as a library and a tools CLI program☆292Updated 2 months ago
- Inference of Large Multimodal Models in C/C++. LLaVA and others☆48Updated 2 years ago
- Chunk Dedupe Estimation☆20Updated 11 months ago
- Port of Facebook's LLaMA model in C/C++☆21Updated last year
- tinygrad port of the RWKV large language model.☆44Updated 7 months ago
- Falcon LLM ggml framework with CPU and GPU support☆247Updated last year
- ☆40Updated 2 years ago
- Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp☆160Updated 6 months ago
- Easily convert HuggingFace models to GGUF-format for llama.cpp☆23Updated last year
- ☆51Updated last year
- Python bindings for ggml☆146Updated last year