BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆189 Updated 10 months ago
Alternatives and similar repositories for nanoRWKV:
Users who are interested in nanoRWKV are comparing it to the repositories listed below:
- RWKV, in easy to read code ☆71 Updated 3 weeks ago
- Fast modular code to create and train cutting edge LLMs ☆66 Updated 11 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆273 Updated last year
- ☆116 Updated last month
- RWKV-7: Surpassing GPT ☆83 Updated 5 months ago
- RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond! ☆148 Updated 8 months ago
- Evaluating LLMs with Dynamic Data ☆78 Updated 2 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 Updated 6 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆196 Updated 9 months ago
- Inference of Mamba models in pure C ☆187 Updated last year
- Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget ☆146 Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆187 Updated 8 months ago
- Beyond Language Models: Byte Models are Digital World Simulators ☆325 Updated 10 months ago
- Python bindings for ggml ☆140 Updated 7 months ago
- Normalized Transformer (nGPT) ☆167 Updated 4 months ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆362 Updated last year
- PyTorch implementation of models from the Zamba2 series. ☆179 Updated 2 months ago
- scalable and robust tree-based speculative decoding algorithm ☆342 Updated 2 months ago
- A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci… ☆310 Updated last year
- ☆182 Updated this week
- GPTQ inference Triton kernel ☆299 Updated last year
- QuIP quantization ☆51 Updated last year
- Inference RWKV v7 in pure C. ☆30 Updated 2 weeks ago
- tinygrad port of the RWKV large language model. ☆44 Updated last month
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆230 Updated 2 months ago
- ☆82 Updated 11 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆27 Updated 2 months ago
- Experiments with BitNet inference on CPU ☆53 Updated last year
- Token Omission Via Attention ☆126 Updated 6 months ago
- ☆130 Updated 4 months ago