ScalingIntelligence / tokasaurus
☆455 · Updated 3 weeks ago
Alternatives and similar repositories for tokasaurus
Users interested in tokasaurus are comparing it to the libraries listed below.
- Storing long contexts in tiny caches with self-study ☆216 · Updated last month
- Pytorch script hot swap: Change code without unloading your LLM from VRAM ☆124 · Updated 7 months ago
- ☆218 · Updated 9 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆300 · Updated 3 weeks ago
- ☆232 · Updated 4 months ago
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆251 · Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆327 · Updated last year
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆302 · Updated 2 months ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆245 · Updated this week
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆213 · Updated this week
- Pivotal Token Search ☆131 · Updated 4 months ago
- ☆237 · Updated 8 months ago
- LLM Inference on consumer devices ☆125 · Updated 8 months ago
- rl from zero pretrain, can it be done? yes. ☆280 · Updated last month
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 · Updated 9 months ago
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP ☆138 · Updated 2 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆248 · Updated last month
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆300 · Updated this week
- An implementation of bucketMul LLM inference ☆223 · Updated last year
- Simple high-throughput inference library ☆149 · Updated 6 months ago
- Code for training & evaluating Contextual Document Embedding models ☆200 · Updated 6 months ago
- Samples of good AI-generated CUDA kernels ☆91 · Updated 5 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆130 · Updated 3 months ago
- Train your own SOTA deductive reasoning model ☆107 · Updated 8 months ago
- Async RL Training at Scale ☆770 · Updated this week
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆829 · Updated this week
- SIMD quantization kernels ☆92 · Updated 2 months ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆97 · Updated 4 months ago
- Scalable and robust tree-based speculative decoding algorithm ☆362 · Updated 9 months ago
- Long context evaluation for large language models ☆224 · Updated 8 months ago