ScalingIntelligence / tokasaurus
☆458 · Updated 2 weeks ago
Alternatives and similar repositories for tokasaurus
Users interested in tokasaurus are comparing it to the libraries listed below.
- Storing long contexts in tiny caches with self-study ☆218 · Updated this week
- PyTorch script hot swap: Change code without unloading your LLM from VRAM ☆125 · Updated 7 months ago
- ☆219 · Updated 10 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆304 · Updated last month
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model ☆254 · Updated 6 months ago
- LLM Inference on consumer devices ☆125 · Updated 8 months ago
- ☆234 · Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆330 · Updated last year
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆278 · Updated last month
- GRPO training code which scales to 32xH100s for long horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… ☆304 · Updated 3 months ago
- ☆245 · Updated 9 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆249 · Updated 10 months ago
- Simple high-throughput inference library ☆150 · Updated 6 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆223 · Updated 2 weeks ago
- Train your own SOTA deductive reasoning model ☆107 · Updated 9 months ago
- RL from zero pretrain, can it be done? Yes. ☆281 · Updated 2 months ago
- ☆344 · Updated this week
- Pivotal Token Search ☆132 · Updated last week
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) ☆257 · Updated this week
- Long context evaluation for large language models ☆224 · Updated 9 months ago
- Scalable and robust tree-based speculative decoding algorithm ☆363 · Updated 10 months ago
- ☆213 · Updated this week
- PyTorch implementation of models from the Zamba2 series. ☆186 · Updated 10 months ago
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ☆265 · Updated this week
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆100 · Updated 6 months ago
- Felafax is building AI infra for non-NVIDIA GPUs ☆569 · Updated 10 months ago
- Code for training & evaluating Contextual Document Embedding models ☆201 · Updated 6 months ago
- Samples of good AI-generated CUDA kernels ☆92 · Updated 6 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆108 · Updated 9 months ago
- ArcticInference: vLLM plugin for high-throughput, low-latency inference ☆345 · Updated this week