philpax / ggml
Tensor library for machine learning
☆21 · Updated last year
Alternatives and similar repositories for ggml:
Users who are interested in ggml are comparing it to the libraries listed below:
- First token cutoff sampling inference example ☆29 · Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆87 · Updated this week
- Train your own small bitnet model ☆65 · Updated 5 months ago
- LLM inference in C/C++ ☆67 · Updated last week
- A super simple web interface to perform blind tests on LLM outputs. ☆28 · Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 · Updated 2 months ago
- ☆66 · Updated 10 months ago
- Experiments with BitNet inference on CPU ☆53 · Updated last year
- Enable MoE for nanoGPT. ☆24 · Updated last year
- ☆73 · Updated last year
- Light WebUI for lm.rs ☆23 · Updated 5 months ago
- Course Project for COMP4471 on RWKV ☆17 · Updated last year
- Easily convert HuggingFace models to GGUF-format for llama.cpp ☆21 · Updated 8 months ago
- inference code for mixtral-8x7b-32kseqlen ☆99 · Updated last year
- Chroma's fork of hnswlib - a header-only C++/python library for fast approximate nearest neighbors ☆16 · Updated this week
- Command line tool for Deep Infra cloud ML inference service ☆29 · Updated 9 months ago
- Port of Facebook's LLaMA model in C/C++ ☆20 · Updated last year
- Benchmarking tool for assessing LLM models' performance across different hardware ☆16 · Updated last year
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho… ☆108 · Updated 3 weeks ago
- ☆46 · Updated 8 months ago
- ☆12 · Updated 6 months ago
- Implementation of nougat that focuses on processing PDFs locally. ☆81 · Updated 2 months ago
- ☆52 · Updated 11 months ago
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆70 · Updated 2 months ago
- 1.58-bit LLaMa model ☆81 · Updated 11 months ago
- Self-hosted LLM chatbot arena, with yourself as the only judge ☆38 · Updated last year
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆44 · Updated last month
- Training hybrid models for dummies. ☆20 · Updated 2 months ago
- GitHub repo for Peifeng's internship project ☆14 · Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 · Updated last year