ScalingIntelligence / tokasaurus
☆417 · Updated 3 weeks ago
Alternatives and similar repositories for tokasaurus
Users interested in tokasaurus are comparing it to the libraries listed below.
- Storing long contexts in tiny caches with self-study (☆181 · Updated last week)
- GRPO training code that scales to 32xH100s for long-horizon terminal/coding tasks. Base agent is now the top Qwen3 agent on Stanford's T… (☆254 · Updated 3 weeks ago)
- PyTorch script hot swap: change code without unloading your LLM from VRAM (☆125 · Updated 4 months ago)
- Simple & Scalable Pretraining for Neural Architecture Research (☆293 · Updated 3 weeks ago)
- Pivotal Token Search (☆124 · Updated 2 months ago)
- ☆217 · Updated 7 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers (☆322 · Updated 10 months ago)
- ☆227 · Updated 6 months ago
- Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model (☆242 · Updated 3 months ago)
- ☆223 · Updated 2 months ago
- RL from zero pretrain: can it be done? Yes. (☆268 · Updated last month)
- LLM Inference on consumer devices (☆124 · Updated 6 months ago)
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" (☆245 · Updated 7 months ago)
- PyTorch Single Controller (☆419 · Updated this week)
- PyTorch implementation of models from the Zamba2 series (☆185 · Updated 7 months ago)
- Train your own SOTA deductive reasoning model (☆106 · Updated 6 months ago)
- Lightweight toolkit to train and fine-tune 1.58-bit language models (☆88 · Updated 4 months ago)
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs) (☆210 · Updated last week)
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines (☆706 · Updated this week)
- Code for training & evaluating Contextual Document Embedding models (☆197 · Updated 4 months ago)
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache (☆124 · Updated last month)
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR (☆164 · Updated 4 months ago)
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models (☆97 · Updated 2 months ago)
- Felafax is building AI infra for non-NVIDIA GPUs (☆566 · Updated 7 months ago)
- An implementation of bucketMul LLM inference (☆223 · Updated last year)
- PCCL (Prime Collective Communications Library) implements fault-tolerant collective communications over IP (☆120 · Updated last week)
- Long-context evaluation for large language models (☆221 · Updated 6 months ago)
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters (☆129 · Updated 9 months ago)
- Explore token trajectory trees on instruct and base models (☆132 · Updated 3 months ago)
- SIMD quantization kernels (☆87 · Updated last week)