ScalingIntelligence / tokasaurus
☆461 Updated last month
Alternatives and similar repositories for tokasaurus
Users who are interested in tokasaurus are comparing it to the libraries listed below.
- Storing long contexts in tiny caches with self-study ☆228 Updated 3 weeks ago
- ☆219 Updated 11 months ago
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆125 Updated 8 months ago
- MoE training for Me and You and maybe other people ☆298 Updated 2 weeks ago
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆317 Updated 4 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆305 Updated 3 weeks ago
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆255 Updated 7 months ago
- ☆234 Updated 6 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆290 Updated 2 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 Updated 11 months ago
- LLM Inference on consumer devices ☆128 Updated 9 months ago
- SIMD quantization kernels ☆93 Updated 3 months ago
- ☆252 Updated 9 months ago
- Train your own SOTA deductive reasoning model ☆107 Updated 9 months ago
- Long context evaluation for large language models ☆224 Updated 9 months ago
- Pivotal Token Search ☆141 Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆344 Updated last year
- PyTorch implementation of models from the Zamba2 series. ☆186 Updated 11 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆263 Updated this week
- Samples of good AI generated CUDA kernels ☆95 Updated 7 months ago
- Felafax is building AI infra for non-NVIDIA GPUs ☆569 Updated 11 months ago
- ☆72 Updated last month
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆141 Updated 3 months ago
- Simple high-throughput inference library ☆153 Updated 7 months ago
- scalable and robust tree-based speculative decoding algorithm ☆366 Updated 11 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆136 Updated 4 months ago
- rl from zero pretrain, can it be done? yes. ☆282 Updated 3 months ago
- Tilus is a tile-level kernel programming language with explicit control over shared memory and registers. ☆435 Updated 2 weeks ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆236 Updated last month
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆109 Updated 9 months ago