ScalingIntelligence / tokasaurus
☆440 Updated last month
Alternatives and similar repositories for tokasaurus
Users who are interested in tokasaurus are comparing it to the libraries listed below.
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆124 Updated 5 months ago
- Storing long contexts in tiny caches with self-study ☆194 Updated 3 weeks ago
- ☆218 Updated 8 months ago
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆244 Updated 4 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆322 Updated 11 months ago
- ☆233 Updated 7 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆296 Updated last month
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆269 Updated last month
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆181 Updated last week
- PyTorch Single Controller ☆435 Updated this week
- ☆225 Updated 3 months ago
- Pivotal Token Search ☆126 Updated 2 months ago
- PyTorch implementation of models from the Zamba2 series. ☆185 Updated 8 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆224 Updated this week
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆247 Updated 8 months ago
- An implementation of bucketMul LLM inference ☆223 Updated last year
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆270 Updated this week
- code for training & evaluating Contextual Document Embedding models ☆197 Updated 4 months ago
- rl from zero pretrain, can it be done? yes. ☆275 Updated last week
- Train your own SOTA deductive reasoning model ☆107 Updated 7 months ago
- Felafax is building AI infra for non-NVIDIA GPUs ☆567 Updated 8 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆414 Updated last week
- Long context evaluation for large language models ☆222 Updated 7 months ago
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆755 Updated last week
- Simple high-throughput inference library ☆142 Updated 4 months ago
- LLM Inference on consumer devices ☆124 Updated 6 months ago
- a curated list of data for reasoning ai ☆137 Updated last year
- Lightweight Nearest Neighbors with Flexible Backends ☆308 Updated this week
- Tilus is a tile-level kernel programming language with explicit control over shared memory and registers. ☆373 Updated this week
- Super-fast Structured Outputs ☆539 Updated 2 weeks ago