feifeibear/PSTensor


feifeibear / PSTensor

PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class.

☆10

Alternatives and similar repositories for PSTensor

Users who are interested in PSTensor are comparing it to the libraries listed below.


feifeibear / PyTorchMemTracer
Depict GPU memory footprint during DNN training of PyTorch
☆11 · Nov 17, 2022 · Updated 3 years ago
zhuzilin / pytorch-malloc
An external memory allocator example for PyTorch.
☆16 · Aug 10, 2025 · Updated 11 months ago
zhuzilin / chatgpt-desktop
Desktop version of ChatGPT, support manually set cookie
☆19 · Dec 9, 2022 · Updated 3 years ago
uwsampl / dtr-prototype
Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616
☆133 · Jul 6, 2023 · Updated 3 years ago
pku-minic / next-gen-ir-proposal
Proposal for the next generation of course-oriented IR.
☆10 · Dec 24, 2021 · Updated 4 years ago
vllm-project / vllm-nccl
Manages vllm-nccl dependency
☆18 · Jun 3, 2024 · Updated 2 years ago
S-Lab-System-Group / Primo
Primo: Practical Learning-Augmented Systems with Interpretable Models
☆19 · Dec 26, 2023 · Updated 2 years ago
ryantd / veloce
WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous.
☆17 · Aug 4, 2022 · Updated 3 years ago
hpcaitech / Elixir
Elixir: Train a Large Language Model on a Small GPU Cluster
☆16 · Jun 8, 2023 · Updated 3 years ago
S-Lab-System-Group / ChronusArtifact
☆23 · Jan 7, 2022 · Updated 4 years ago
hpcaitech / CachedEmbedding
A memory efficient DLRM training solution using ColossalAI
☆108 · Nov 22, 2022 · Updated 3 years ago
S-Lab-System-Group / Hydro
Surrogate-based Hyperparameter Tuning System
☆30 · Jun 29, 2023 · Updated 3 years ago
SJTU-IPADS / PhoenixOS-Remoting
☆21 · Jul 10, 2025 · Updated last year
SJTU-IPADS / fgnn-artifacts
FGNN's artifact evaluation (EuroSys 2022)
☆18 · Apr 25, 2022 · Updated 4 years ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆214 · Nov 5, 2020 · Updated 5 years ago
xdit-project / DiTCacheAnalysis
An auxiliary project analysis of the characteristics of KV in DiT Attention.
☆34 · Nov 29, 2024 · Updated last year
hpcaitech / GPT-Demo
GPT Demo with hybrid distributed training
☆10 · Dec 1, 2022 · Updated 3 years ago
xmcp / pku-eutopia
Zhaojing University (兆京大学) shuttle bus reservation for Humans™
☆32 · Mar 19, 2026 · Updated 4 months ago
promoe-opensource / promoe
☆20 · Jan 27, 2025 · Updated last year
dose78 / CARMA
Communication-Avoiding Recursive Matrix Multiply
☆17 · Jul 10, 2013 · Updated 13 years ago
casys-kaist / HUVM
☆27 · Aug 19, 2022 · Updated 3 years ago
weifengliu-ssslab / Benchmark_SpTRSM_using_CSC
Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides (SpTRSM)
☆17 · Feb 14, 2020 · Updated 6 years ago
AD1024 / veripy
Python3 auto-active verification library (migrated to an Intel project)
☆24 · Apr 7, 2022 · Updated 4 years ago
TurboNLP / Translate-Demo
A Translation Task using TurboTransformers
☆10 · Dec 17, 2020 · Updated 5 years ago
Olament / Hanzi2PinyinEngine
Hanzi to Pinyin engine in Swift (Pinyin input method engine)
☆13 · Mar 29, 2024 · Updated 2 years ago
gaocegege / xuruowei-forever
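https://xuruowei.com was created in her memory by her family, her friends, and her beloved, Gao Ce. Xu Ruowei passed away on February 28, 2026. We hope to commemorate her life through this timeline: photos, stories, writings, music, and everything she loved. Walk along the path of her life and touch once more those moments of warmth.
☆28 · Apr 1, 2026 · Updated 3 months ago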
jianweif / OptimalGradCheckpointing
☆41 · Jun 18, 2021 · Updated 5 years ago
huawei-cloudnative / firmament
The Firmament cluster scheduling platform
☆19 · Mar 15, 2019 · Updated 7 years ago
jkehne / cuda-malloc-hook
Drop-in library for tracking the memory allocations of CUDA applications
☆14 · Nov 17, 2017 · Updated 8 years ago
hpcaitech / ColossalAI-Benchmark
Performance benchmarking with ColossalAI
☆39 · Jul 6, 2022 · Updated 4 years ago
spcl / substation
Research and development for optimizing transformers
☆132 · Feb 16, 2021 · Updated 5 years ago
feifeibear / SeeReel
Agent-native Seedance 2.0 short-film studio: cli for AI, canvas for human
☆15 · Jun 14, 2026 · Updated last month
RayeRen / RayeRen
☆11 · Apr 7, 2026 · Updated 3 months ago
YukeWang96 / MGG_OSDI23
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…
☆40 · Mar 17, 2024 · Updated 2 years ago
hpcaitech / SkyComputing
Sky Computing: Accelerating Geo-distributed Computing in Federated Learning
☆90 · Nov 22, 2022 · Updated 3 years ago
olcf / NVIDIA-tensor-core-examples
☆20 · Nov 7, 2019 · Updated 6 years ago
microsoft / tensorflow-rematerialization
Implementation of a Tensorflow XLA rematerialization pass
☆15 · Dec 20, 2019 · Updated 6 years ago
SymbioticLab / Tiresias
Tiresias is a GPU cluster manager for distributed deep learning training.
☆166 · May 7, 2020 · Updated 6 years ago
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24 · Nov 21, 2024 · Updated last year