KerfuffleV2 / smolrsrwkv
A relatively basic implementation of RWKV in Rust written by someone with very little math and ML knowledge. Supports 32-, 8-, and 4-bit evaluation. It can also directly load PyTorch RWKV models.
☆93 · Updated last year
Alternatives and similar repositories for smolrsrwkv:
Users who are interested in smolrsrwkv are comparing it to the repositories listed below.
- ☆32 · Updated 2 years ago
- GGML bindings that aim to be idiomatic Rust rather than directly corresponding to the C/C++ interface ☆19 · Updated last year
- ☆59 · Updated 2 years ago
- ☆26 · Updated last year
- Bleeding edge low level Rust binding for GGML ☆16 · Updated 9 months ago
- A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust ☆79 · Updated last year
- Implementation of the RWKV language model in pure WebGPU/Rust. ☆298 · Updated last week
- High-level, optionally asynchronous Rust bindings to llama.cpp ☆217 · Updated 10 months ago
- A highly customizable, full scale web backend for web-rwkv, built on axum with websocket protocol. ☆26 · Updated last year
- tinygrad port of the RWKV large language model. ☆44 · Updated last month
- Implementing the BitNet model in Rust ☆31 · Updated last year
- A collection of LLM token samplers in Rust ☆17 · Updated last year
- Rust+OpenCL+AVX2 implementation of LLaMA inference code ☆544 · Updated last year
- LLaMa 7b with CUDA acceleration implemented in rust. Minimal GPU memory needed! ☆104 · Updated last year
- Low rank adaptation (LoRA) for Candle. ☆142 · Updated this week
- ☆40 · Updated 2 years ago
- RWKV models and examples powered by candle. ☆18 · Updated last month
- Inference of Mamba models in pure C ☆187 · Updated last year
- 8-bit floating point types for Rust ☆46 · Updated last month
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆38 · Updated last year
- LLaMA from First Principles ☆51 · Updated last year
- A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci… ☆310 · Updated last year
- WebGPU LLM inference tuned by hand ☆149 · Updated last year
- ☆23 · Updated last week
- ☆19 · Updated 6 months ago
- Rust implementation of Huggingface transformers pipelines using onnxruntime backend with bindings to C# and C. ☆38 · Updated 2 years ago
- Framework agnostic python runtime for RWKV models ☆146 · Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆56 · Updated last year
- This project aims to make RWKV Accessible to everyone using a Hugging Face like interface, while keeping it close to the R and D RWKV bra… ☆64 · Updated last year
- Port of Microsoft's BioGPT in C/C++ using ggml ☆88 · Updated last year