philpax / ggml
Tensor library for machine learning
☆21 · Updated 2 years ago
Alternatives and similar repositories for ggml
Users who are interested in ggml are comparing it to the libraries listed below.
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆51 · Updated 11 months ago
- First token cutoff sampling inference example ☆30 · Updated 2 years ago
- GGUF implementation in C as a library and a tools CLI program ☆303 · Updated 5 months ago
- Python bindings for ggml ☆147 · Updated last year
- ggml implementation of embedding models including SentenceTransformer and BGE ☆63 · Updated 2 years ago
- Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp ☆170 · Updated 9 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆115 · Updated 6 months ago
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu… ☆55 · Updated 2 years ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆93 · Updated this week
- Experiments with BitNet inference on CPU ☆55 · Updated last year
- Inference of Mamba and Mamba2 models in pure C ☆196 · Updated 3 weeks ago
- inference code for mixtral-8x7b-32kseqlen ☆105 · Updated 2 years ago
- Port of Facebook's LLaMA model in C/C++ ☆23 · Updated last year
- tinygrad port of the RWKV large language model. ☆45 · Updated 11 months ago
- An innovative library for efficient LLM inference via low-bit quantization ☆352 · Updated last year
- Inference Llama 2 in one file of pure C++ ☆87 · Updated 2 years ago
- A super simple web interface to perform blind tests on LLM outputs. ☆29 · Updated last year
- Train your own small bitnet model ☆77 · Updated last year
- ☆26 · Updated 2 years ago
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a… ☆42 · Updated 7 months ago
- RWKV in nanoGPT style ☆197 · Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆53 · Updated 2 years ago
- Scripts to create your own moe models using mlx ☆90 · Updated last year
- Benchmarking suite for popular AI APIs ☆88 · Updated last year
- 🏥 Health monitor for a Petals swarm ☆40 · Updated last year
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆48 · Updated 2 years ago
- a simplified version of Google's Gemma model to be used for learning ☆26 · Updated last year
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆92 · Updated last week
- GGUF parser in Python ☆28 · Updated last year