kooyunmo / cuda-uvm-gpt2
PyTorch-UVM on super-large language models.
☆17 · Dec 21, 2020 · Updated 5 years ago
Alternatives and similar repositories for cuda-uvm-gpt2
Users who are interested in cuda-uvm-gpt2 are comparing it to the libraries listed below.
- Tensors and Dynamic neural networks in Python with strong GPU acceleration ☆15 · Dec 21, 2020 · Updated 5 years ago
- ☆26 · Aug 19, 2022 · Updated 3 years ago
- ☆81 · Nov 16, 2020 · Updated 5 years ago
- Pannotia v0.9 is a suite of OpenCL graph applications ☆24 · Sep 13, 2017 · Updated 8 years ago
- HeteroSync is a benchmark suite for performing fine-grained synchronization on tightly coupled GPUs ☆31 · Sep 19, 2024 · Updated last year
- A fast, accurate, and easy-to-integrate memory simulator that models memory system performance with bandwidth-latency curves. ☆33 · Oct 18, 2025 · Updated 3 months ago
- ☆33 · Sep 9, 2020 · Updated 5 years ago
- The SEAL-CPU backend is a Reference backend engine for HEBench which is a shared library that implements the required functions specified… ☆11 · Mar 3, 2023 · Updated 2 years ago
- This serves as a repository for reproducibility of the SC21 paper "In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated… ☆39 · Sep 25, 2023 · Updated 2 years ago
- DELTA-pytorch: DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation ☆12 · Apr 16, 2024 · Updated last year
- Code for ICML21 paper "Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation" ☆12 · Feb 8, 2023 · Updated 3 years ago
- ☆12 · Apr 30, 2024 · Updated last year
- CR-LT KGQA Dataset Repository ☆11 · Jun 1, 2025 · Updated 8 months ago
- An EDM-enabled PHY + a rack-level network simulator ☆12 · Dec 11, 2024 · Updated last year
- Memory management simulator, using Hashed Page Table. Page Replacement Algorithms: Least Recently Used (LRU) and Second Chance. ☆10 · Apr 12, 2021 · Updated 4 years ago
- Verilog RTL Implementation of DNN ☆10 · Jun 26, 2018 · Updated 7 years ago
- ☆11 · Jan 5, 2022 · Updated 4 years ago
- Secure Inference Resilient Against Malicious Clients ☆15 · May 3, 2022 · Updated 3 years ago
- SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization ☆11 · Aug 12, 2020 · Updated 5 years ago
- Repo for PyChart 1.39, refs http://download.gna.org/pychart/ ☆10 · Sep 29, 2014 · Updated 11 years ago
- A project that patches the Xiaomi Linux system so it can connect to ChatGPT with WebRTC and WebSocket ☆10 · Aug 29, 2025 · Updated 5 months ago
- The algorithms for multilevel evaluation of balance in signed directed networks ☆10 · Jul 4, 2024 · Updated last year
- ☆14 · Dec 13, 2023 · Updated 2 years ago
- Slowdown prediction module of Echo: Simulating Distributed Training at Scale ☆13 · May 17, 2025 · Updated 8 months ago
- Multiple 1-stencil implementations using NVIDIA CUDA. ☆13 · Dec 2, 2017 · Updated 8 years ago
- ☆11 · Jul 2, 2024 · Updated last year
- GVProf: A Value Profiler for GPU-based Clusters ☆52 · Mar 24, 2024 · Updated last year
- ☆12 · Apr 3, 2024 · Updated last year
- Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes. ☆13 · Jul 9, 2023 · Updated 2 years ago
- ☆10 · Feb 11, 2023 · Updated 3 years ago
- ☆13 · Oct 6, 2024 · Updated last year
- ☆14 · Feb 5, 2025 · Updated last year
- A repository for code used in the paper "On the precision loss in approximate homomorphic encryption" ☆10 · Jan 16, 2025 · Updated last year
- including compiler to encode DGL GNN model to instructions, runtime software to transfer data and control the accelerator, and hardware v… ☆14 · Nov 19, 2023 · Updated 2 years ago
- Repo to hold HammerBlade PyTorch port. Based on PyTorch v1.4.0 ☆14 · Oct 4, 2022 · Updated 3 years ago
- (elastic) cuckoo hashing ☆15 · Jun 20, 2020 · Updated 5 years ago
- LaTeX template for dissertation proposals in Peking University Shenzhen. ☆15 · Feb 23, 2022 · Updated 3 years ago
- Source code for paper "Prioritized Restreaming Algorithms for Balanced Graph Partitioning". ☆14 · Dec 31, 2024 · Updated last year
- ☆12 · Oct 25, 2022 · Updated 3 years ago