nomic-ai / kompute
General-purpose GPU compute framework built on Vulkan to support thousands of cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous, and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.
☆52 · Updated 6 months ago
Alternatives and similar repositories for kompute
Users interested in kompute are comparing it to the libraries listed below.
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆170 · Updated 3 weeks ago
- Inference of Mamba models in pure C ☆191 · Updated last year
- ☆62 · Updated last year
- Prepare for DeepSeek R1 inference: benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 · Updated 6 months ago
- llama.cpp fork used by GPT4All ☆56 · Updated 6 months ago
- Python bindings for ggml ☆146 · Updated 11 months ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆86 · Updated last year
- GGML implementation of the BERT model with Python bindings and quantization. ☆56 · Updated last year
- RWKV in nanoGPT style ☆192 · Updated last year
- GPT2 implementation in C++ using Ort ☆26 · Updated 4 years ago
- LLM-based code completion engine ☆193 · Updated 7 months ago
- ggml implementation of embedding models including SentenceTransformer and BGE ☆59 · Updated last year
- AMD-related optimizations for transformer models ☆83 · Updated last week
- Thin wrapper around GGML to make life easier ☆40 · Updated 2 months ago
- instinct.cpp provides ready-to-use alternatives to the OpenAI Assistant API and built-in utilities for developing AI agent applications (RAG,… ☆53 · Updated last year
- Port of Suno AI's Bark in C/C++ for fast inference ☆52 · Updated last year
- Train your own small BitNet model ☆75 · Updated 10 months ago
- TTS support with GGML ☆160 · Updated this week
- High-Performance Text Deduplication Toolkit ☆49 · Updated this week
- Embeddings-focused small version of the Llama NLP model ☆104 · Updated 2 years ago
- llama.cpp to PyTorch Converter ☆34 · Updated last year
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- No-code CLI designed for accelerating ONNX workflows ☆208 · Updated 2 months ago
- A C++ port of karpathy/llm.c featuring a tiny torch library while maintaining overall simplicity ☆35 · Updated last year
- vLLM: a high-throughput and memory-efficient inference and serving engine for LLMs ☆88 · Updated this week
- Course project for COMP4471 on RWKV ☆17 · Updated last year
- Inference of RWKV v7 in pure C ☆38 · Updated this week
- C API for MLX ☆124 · Updated last month
- An innovative library for efficient LLM inference via low-bit quantization ☆349 · Updated last year
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆77 · Updated 2 weeks ago