fzyzcjy/torch_memory_saver

fzyzcjy / torch_memory_saver

Allow torch tensor memory to be released and resumed later

☆261

Alternatives and similar repositories for torch_memory_saver

Users that are interested in torch_memory_saver are comparing it to the libraries listed below.

fzyzcjy / torch_utils
Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio…
☆114 · Sep 11, 2025 · Updated 10 months ago
ISEEKYAN / mbridge
Bridge Megatron-Core to Hugging Face/Reinforcement Learning
☆228 · Jun 15, 2026 · Updated last month
ByteDance-Seed / Triton-distributed
Distributed Compiler based on Triton for Parallel Systems
☆1,503 · Jul 20, 2026 · Updated last week
MoonshotAI / checkpoint-engine
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
☆984 · Jul 4, 2026 · Updated 3 weeks ago
inclusionAI / asystem-amem
A NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.
☆113 · Dec 17, 2025 · Updated 7 months ago
yifuwang / symm-mem-recipes
☆170 · Dec 27, 2024 · Updated last year
TransferQueue / TransferQueue
[Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…
☆16 · Jan 16, 2026 · Updated 6 months ago
Dao-AILab / sonic-moe
Accelerating MoE with IO and Tile-aware Optimizations
☆732 · Jul 4, 2026 · Updated 3 weeks ago
Dao-AILab / quack
A Quirky Assortment of CuTe Kernels
☆1,076 · Updated this week
radixark / miles
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,805 · Updated this week
perplexityai / pplx-kernels
Perplexity GPU Kernels
☆595 · Nov 7, 2025 · Updated 8 months ago
RLsys-Foundation / TritonForge
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…
☆146 · Nov 10, 2025 · Updated 8 months ago
Victarry / PP-Schedule-Visualization
Pipeline Parallelism Emulation and Visualization
☆85 · Jun 30, 2026 · Updated 3 weeks ago
stepfun-ai / StepMesh
☆380 · Jan 28, 2026 · Updated 6 months ago
bytedance / flux
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,348 · Aug 28, 2025 · Updated 11 months ago
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
☆6,053 · Updated this week
infinigence / HamiltonAttention
☆45 · Oct 15, 2025 · Updated 9 months ago
SandAI-org / MagiAttention
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
☆891 · Updated this week
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆1,038 · Sep 10, 2025 · Updated 10 months ago
chenyu-jiang / dcp
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21 · Nov 28, 2025 · Updated 8 months ago
meta-pytorch / torchcomms
torchcomms: a modern PyTorch communications API
☆381 · Updated this week
kvcache-ai / Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆6,063 · Updated this week
leepoly / sm-profiler
☆84 · Feb 5, 2026 · Updated 5 months ago
sgl-project / SpecForge
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆1,018 · Updated this week
zhaochenyang20 / Awesome-ML-SYS-Tutorial
My learning notes for ML SYS.
☆6,782 · Updated this week
yzlnew / infra-skills
A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo…
☆140 · Jul 9, 2026 · Updated 2 weeks ago
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,679 · Updated this week
Victarry / PyTorch-Memory-Profiler
☆47 · Sep 8, 2025 · Updated 10 months ago
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆243 · Jan 20, 2026 · Updated 6 months ago
flashinfer-ai / cutlass-viz
☆65 · Apr 26, 2025 · Updated last year
NVIDIA / TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,451 · Updated this week
NVIDIA / nvshmem
NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…
☆566 · Jul 20, 2026 · Updated last week
tile-ai / tilelang
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
☆7,007 · Updated this week
simveit / effective_transpose
Effective transpose on Hopper GPU
☆29 · Sep 6, 2025 · Updated 10 months ago
zhuzilin / flash-attention-with-sink
☆37 · Aug 7, 2025 · Updated 11 months ago
KuangjuX / NVSHMEM-Tutorial
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆195 · Feb 11, 2026 · Updated 5 months ago
sgl-project / sgl-learning-materials
Materials for learning SGLang
☆861 · Jan 5, 2026 · Updated 6 months ago
ai-dynamo / nixl
NVIDIA Inference Xfer Library (NIXL)
☆1,155 · Updated this week
tile-ai / TileRT
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
☆1,596 · Jul 14, 2026 · Updated 2 weeks ago