BlinkDL / nanoRWKV
RWKV in nanoGPT style
☆189 · Updated 10 months ago
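For context, nanoRWKV implements the RWKV recurrence in the minimal, hackable style of nanoGPT. The sketch below is a purely illustrative NumPy version of the RWKV-4 style WKV time-mixing recurrence, under the assumption that this is the core operation such a codebase implements; it is not taken from the repository, and the function and parameter names (`wkv_recurrence`, `w`, `u`, etc.) are hypothetical.

```python
# Illustrative sketch only: a naive NumPy version of the RWKV-4 style
# WKV time-mixing recurrence. Not code from BlinkDL/nanoRWKV; the function
# and parameter names are hypothetical.
import numpy as np

def wkv_recurrence(k, v, w, u):
    """Sequential WKV mixing.

    k, v : (T, C) key and value sequences
    w    : (C,) per-channel decay rate (applied as exp(-w) each step)
    u    : (C,) per-channel "bonus" applied only to the current token
    Returns a (T, C) array of mixed outputs.
    """
    T, C = k.shape
    num = np.zeros(C)          # running exp-weighted sum of past values
    den = np.zeros(C)          # running sum of the same exp weights
    out = np.empty((T, C))
    decay = np.exp(-w)
    for t in range(T):
        cur = np.exp(u + k[t])                    # extra weight for token t itself
        out[t] = (num + cur * v[t]) / (den + cur)
        num = decay * num + np.exp(k[t]) * v[t]   # absorb token t into the state
        den = decay * den + np.exp(k[t])
    return out

# NOTE: real implementations use a numerically stable variant that tracks a
# running maximum of the exponents instead of exponentiating directly, and
# usually fuse the loop into a CUDA kernel.

# Tiny smoke test with random data
rng = np.random.default_rng(0)
T, C = 8, 16
y = wkv_recurrence(rng.normal(size=(T, C)), rng.normal(size=(T, C)),
                   w=np.full(C, 0.5), u=np.zeros(C))
print(y.shape)  # (8, 16)
```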
Alternatives and similar repositories for nanoRWKV:
Users interested in nanoRWKV are comparing it to the libraries listed below.
- RWKV, in easy-to-read code ☆72 · Updated last month
- Inference of Mamba models in pure C ☆188 · Updated last year
- RWKV-7: Surpassing GPT ☆83 · Updated 5 months ago
- RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond! ☆148 · Updated 8 months ago
- Fast modular code to create and train cutting-edge LLMs ☆66 · Updated 11 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆274 · Updated last year
- ☆121 · Updated 3 weeks ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 · Updated 6 months ago
- Evaluating LLMs with Dynamic Data ☆86 · Updated 2 weeks ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget ☆149 · Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆199 · Updated 9 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆231 · Updated 3 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆126 · Updated 5 months ago
- Token Omission Via Attention ☆126 · Updated 6 months ago
- A large-scale RWKV v6, v7 (World, ARWKV, PRWKV) inference. Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy o… ☆35 · Updated this week
- Beyond Language Models: Byte Models are Digital World Simulators ☆324 · Updated 11 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆232 · Updated 2 months ago
- ☆82 · Updated 11 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 · Updated 8 months ago
- Experiments on speculative sampling with Llama models ☆125 · Updated last year
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆27 · Updated this week
- Normalized Transformer (nGPT) ☆174 · Updated 5 months ago
- ☆186 · Updated this week
- Python bindings for ggml ☆140 · Updated 8 months ago
- Inference of RWKV v7 in pure C. ☆33 · Updated last month
- Low-bit optimizers for PyTorch ☆128 · Updated last year
- PyTorch implementation of models from the Zamba2 series. ☆180 · Updated 3 months ago
- QuIP quantization ☆52 · Updated last year
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆71 · Updated 3 months ago
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" ☆362 · Updated last year