valine / training-hot-swap
PyTorch script hot swap: change code without unloading your LLM from VRAM
☆112 · Updated this week
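The tagline above describes a general technique: keep the expensive-to-load model resident in a long-lived process (and thus in VRAM) while the training code around it is edited and re-executed. The sketch below illustrates that idea only; it is not the repository's actual implementation, and the `make_module`/`step` names are hypothetical. In practice the reloaded code would live in a file-backed module and be refreshed with `importlib.reload`.

```python
# Minimal sketch of code hot-swapping around a resident model.
# The "model" stays loaded in the long-lived process while the user's
# step function is edited and rebuilt; only the code is swapped.
# NOTE: make_module/step are hypothetical names, not from the repo.
import types


def make_module(source: str, name: str = "hot_code") -> types.ModuleType:
    """Stand-in for an editable file on disk: build a module from source."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    return mod


# Expensive-to-load state; in practice a torch.nn.Module on a CUDA device.
model_state = {"weights": [1.0, 2.0, 3.0]}

# First version of the user's step function.
mod = make_module("def step(state):\n    return sum(state['weights'])\n")
print(mod.step(model_state))  # 6.0

# "Edit" the code and rebuild the module; model_state was never reloaded.
mod = make_module("def step(state):\n    return max(state['weights'])\n")
print(mod.step(model_state))  # 3.0
```

The payoff is that the slow load (checkpoint deserialization, CUDA transfer) happens once, while the iteration loop on the surrounding script code becomes near-instant.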
Alternatives and similar repositories for training-hot-swap:
Users interested in training-hot-swap are comparing it to the libraries listed below.
- Hierarchical Navigable Small Worlds · ☆96 · Updated 2 weeks ago
- ☆242 · Updated last year
- ☆46 · Updated 3 weeks ago
- An implementation of bucketMul LLM inference · ☆216 · Updated 9 months ago
- DiscoGrad - automatically differentiate across conditional branches in C++ programs · ☆202 · Updated 7 months ago
- Dead Simple LLM Abliteration · ☆211 · Updated 2 months ago
- look how they massacred my boy · ☆63 · Updated 6 months ago
- A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception. Players engage in public and private co… · ☆253 · Updated 2 weeks ago
- PyTorch implementation of models from the Zamba2 series. · ☆179 · Updated 3 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers · ☆62 · Updated this week
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l… · ☆282 · Updated last week
- Mistral7B playing DOOM · ☆131 · Updated 9 months ago
- A playground to make it easy to try crazy things · ☆33 · Updated this week
- A pure NumPy implementation of Mamba. · ☆222 · Updated 9 months ago
- Autograd to GPT-2 completely from scratch · ☆112 · Updated this week
- ☆163 · Updated 11 months ago
- Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and … · ☆204 · Updated 5 months ago
- A library for incremental loading of large PyTorch checkpoints · ☆56 · Updated 2 years ago
- A copy of ONNX models, datasets, and code all in one GitHub repository. Follow the README to learn more. · ☆105 · Updated last year
- Live-bending a foundation model’s output at neural network level. · ☆241 · Updated 2 weeks ago
- Lightweight Nearest Neighbors with Flexible Backends · ☆267 · Updated last month
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). · ☆250 · Updated last year
- Run and explore Llama models locally with minimal dependencies on CPU · ☆189 · Updated 6 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. · ☆304 · Updated 6 months ago
- A GPU Accelerated Binary Vector Store · ☆47 · Updated 2 months ago
- Alice in Wonderland code base for experiments and raw experiments data · ☆129 · Updated 2 months ago
- Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems · ☆99 · Updated 3 weeks ago
- a curated list of data for reasoning ai · ☆134 · Updated 8 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters · ☆126 · Updated 4 months ago
- Lightweight Pandas monkey-patch that adds async support to map, apply, applymap, aggregate, and transform, enabling seamless handling of … · ☆125 · Updated last month