wozeparrot / tinyrwkvLinks

tinygrad port of the RWKV large language model.

☆44

Alternatives and similar repositories for tinyrwkv

Users that are interested in tinyrwkv are comparing it to the libraries listed below

Sorting:

lachlansneff / sparsellama
☆40Updated 2 years ago
kroggen / mamba.c
Inference of Mamba models in pure C
☆192Updated last year
BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆195Updated last year
harrisonvanderbyl / rwkv-cpp-accelerated
A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci…
☆313Updated last year
harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆146Updated 2 years ago
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆93Updated last month
BlinkDL / fast.c
Prepare for DeekSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.
☆73Updated 9 months ago
astramind-ai / BitMat
An efficent implementation of the method proposed in "The Era of 1-bit LLMs"
☆154Updated last year
euclaise / supertrainer2000
☆50Updated last year
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119Updated last year
NolanoOrg / sparse_quant_llms
SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia
☆40Updated 2 years ago
iamlemec / bert.cpp
GGML implementation of BERT model with Python bindings and quantization.
☆56Updated last year
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆100Updated last year
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆62Updated 2 years ago
SmerkyG / RWKV_Explained
RWKV, in easy to read code
☆72Updated 7 months ago
kayvr / token-hawk
WebGPU LLM inference tuned by hand
☆150Updated 2 years ago
abetlen / ggml-python
Python bindings for ggml
☆146Updated last year
Gryphe / MergeMonster
An unsupervised model merging algorithm for Transformers-based language models.
☆106Updated last year
pranavjad / tinyllama-bitnet
Train your own small bitnet model
☆74Updated last year
GreenBitAI / low_bit_llama
Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs
☆110Updated last year
lukasVierling / FaceRWKV
Course Project for COMP4471 on RWKV
☆17Updated last year
catid / bitnet_cpu
Experiments with BitNet inference on CPU
☆54Updated last year
BlinkDL / WorldModel
Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /…
☆39Updated 2 years ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31Updated last year
jadechip / nanoXLSTM
The simplest, fastest repository for training/finetuning medium-sized xLSTMs.
☆41Updated last year
rmihaylov / mpttune
Tune MPTs
☆84Updated 2 years ago
PABannier / biogpt.cpp
Port of Microsoft's BioGPT in C/C++ using ggml
☆85Updated last year
RWKV / RWKV-infctx-trainer
RWKV infctx trainer, for training arbitary context sizes, to 10k and beyond!
☆146Updated last year
SmerkyG / gptcore
Fast modular code to create and train cutting edge LLMs
☆68Updated last year
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆277Updated 2 years ago