valine / training-hot-swap
PyTorch script hot swap: change code without unloading your LLM from VRAM
☆125 · Apr 21, 2025 · Updated 9 months ago
Alternatives and similar repositories for training-hot-swap
Users who are interested in training-hot-swap compare it to the repositories listed below.
- Generic representation and manipulation of abstract syntax ☆27 · May 26, 2022 · Updated 3 years ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆73 · Apr 22, 2025 · Updated 9 months ago
- Tensor library & inference framework for machine learning ☆117 · Oct 3, 2025 · Updated 4 months ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆98 · Dec 5, 2024 · Updated last year
- Various test models in WNNX format. View them with `pip install wnetron && wnetron` ☆12 · Jun 22, 2022 · Updated 3 years ago
- ☆11 · Apr 30, 2025 · Updated 9 months ago
- Official implementation for the AAAI 2025 paper "PIXELS - Progressive Image Xemplar-based Editing with Latent Surgery" ☆11 · Dec 17, 2024 · Updated last year
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated last year
- A flat container abstraction for Rust ☆16 · Nov 24, 2025 · Updated 2 months ago
- CECS 342 Lab 4: Logic Languages with SWI-Prolog ☆13 · Nov 19, 2021 · Updated 4 years ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- A Python implementation for stitching images. ☆231 · Oct 3, 2024 · Updated last year
- A browser-based, WebGL2 implementation of GPT-2 with transformer block and attention matrix visualization ☆344 · Oct 24, 2025 · Updated 3 months ago
- ☆36 · Feb 6, 2026 · Updated last week
- An AI character interaction system with emotional modeling and advanced memory management ☆17 · Oct 26, 2024 · Updated last year
- Notes and thoughts from reading papers about PersonReID. ☆10 · Jan 10, 2019 · Updated 7 years ago
- Codebase for "Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes" ☆21 · Jun 15, 2025 · Updated 8 months ago
- A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1 ☆1,164 · Aug 21, 2025 · Updated 5 months ago
- Efficient optimizers ☆283 · Dec 20, 2025 · Updated last month
- ☆40 · Dec 1, 2022 · Updated 3 years ago
- ☆36 · Mar 20, 2024 · Updated last year
- Writing FLUX in Triton ☆41 · Sep 22, 2024 · Updated last year
- Code to accompany the Separating Axis Test blog post ☆59 · Jul 19, 2025 · Updated 6 months ago
- Prompts and evaluation data for LLMs on real-world coding and writing tasks ☆16 · Sep 13, 2025 · Updated 5 months ago
- ☆16 · Nov 21, 2017 · Updated 8 years ago
- Triton kernels for Flux ☆22 · Jul 7, 2025 · Updated 7 months ago
- [ECCV 2024] BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion ☆21 · Jul 2, 2024 · Updated last year
- ☆18 · Apr 10, 2023 · Updated 2 years ago
- Simple function for local patch extraction from OpenCV keypoints. ☆14 · Jul 10, 2019 · Updated 6 years ago
- Animating R1's thoughts. ☆384 · Feb 17, 2025 · Updated 11 months ago
- A naive 3D human pose editor GUI. ☆20 · Jul 12, 2023 · Updated 2 years ago
- A love2d module that enables hot code reloading ☆15 · Sep 5, 2017 · Updated 8 years ago
- ☆17 · Apr 14, 2023 · Updated 2 years ago
- A dashboard for exploring timm learning rate schedulers ☆19 · Nov 22, 2024 · Updated last year
- Supercharge Hugging Face Transformers with model parallelism. ☆78 · Jul 23, 2025 · Updated 6 months ago
- Code to test the raw overhead of a JNI call, as opposed to calling the method from C ☆19 · Feb 18, 2011 · Updated 14 years ago
- Live image description solution using ESP32-CAM + Phone + Server ☆44 · Jan 4, 2025 · Updated last year
- Focused on fast experimentation and simplicity ☆80 · Dec 24, 2024 · Updated last year
- A demo for the paper "Direct Ascent Synthesis: Hidden Generative Capabilities in Discriminative Models" (https://arxiv.org/abs/2502.07753) ☆41 · Mar 5, 2025 · Updated 11 months ago