Example of binding a TF32 CUTLASS GEMM kernel to PyTorch
☆12 Jun 7, 2024 Updated last year
Alternatives and similar repositories for tf32_gemm
Users who are interested in tf32_gemm are comparing it to the libraries listed below
- Inference Llama 2 with a model compiled to native code by TorchInductor ☆14 Feb 8, 2024 Updated 2 years ago
- ☆16 Sep 24, 2024 Updated last year
- extensible collectives library in triton ☆95 Mar 31, 2025 Updated 11 months ago
- Disable YubiKey output on MacOS without a modifier key pressed ☆10 Aug 10, 2022 Updated 3 years ago
- access ChatGPT/Gemini/Claude from Emacs without APIs ☆10 Dec 25, 2025 Updated 2 months ago
- Triton-based Symmetric Memory operators and examples ☆85 Jan 15, 2026 Updated last month
- ☆20 May 24, 2025 Updated 9 months ago
- A Python Library for the 3GPP physical layer ☆14 Dec 18, 2025 Updated 2 months ago
- Templates for commonly used GitHub actions steps ☆13 Dec 13, 2024 Updated last year
- PyTorch Lightning based framework to run experiments for self-supervised learning tasks. ☆10 Feb 14, 2020 Updated 6 years ago
- 1st Place Team Crane: @aswinkumar1999 @rathull @kyolebu ☆29 Sep 8, 2025 Updated 5 months ago
- This repository consists of useful tools or guides for system software development or anything interesting. ☆11 Updated this week
- ☆12 Aug 26, 2025 Updated 6 months ago
- Efficient-Tensor-Management-on-HM-for-Deep-Learning ☆10 Nov 15, 2021 Updated 4 years ago
- Analysis of the MovieLens dataset of movie ratings and reviews. ☆11 Sep 2, 2018 Updated 7 years ago
- ☆15 Mar 30, 2024 Updated last year
- Mirroring displays on Linux ☆13 Aug 22, 2016 Updated 9 years ago
- A perl script for searching and replacing in mathematics in LaTeX documents. ☆13 Jul 21, 2021 Updated 4 years ago
- A tool for visualization of complex job searches. ☆13 Jul 8, 2022 Updated 3 years ago
- ☆15 Mar 26, 2025 Updated 11 months ago
- KANs and MLPs ☆12 Jun 7, 2024 Updated last year
- Funding schemes and travel grant opportunities for postdocs ☆10 Jun 2, 2018 Updated 7 years ago
- Pairwise Controlled Manifold Approximation (PaCMAP) for dimensionality reduction ☆20 Feb 3, 2026 Updated 3 weeks ago
- "Not too complicated" training code for CIFAR-10 by PyTorch Lightning ☆12 Jun 5, 2022 Updated 3 years ago
- A fast implementation of log() and exp() ☆57 Dec 14, 2022 Updated 3 years ago
- A Collection of GitHub Profiles with awesome readme ☆14 Aug 17, 2023 Updated 2 years ago
- ☆16 Jun 15, 2023 Updated 2 years ago
- ☆12 Jul 6, 2017 Updated 8 years ago
- ☆16 Feb 23, 2021 Updated 5 years ago
- ☆15 Dec 29, 2022 Updated 3 years ago
- Unofficial mirror of pdftk - imported using git-ubuntu ☆10 Aug 20, 2018 Updated 7 years ago
- Simple repository contribution statistics ☆15 Updated this week
- Ongoing research training transformer models at scale ☆18 Updated this week
- GeomScale in Google Summer of Code 2020 ☆13 Jan 14, 2020 Updated 6 years ago
- Intelligent Resource Requirement Estimation and Scheduling for Deep Learning Jobs on Distributed GPU Clusters ☆15 Nov 18, 2021 Updated 4 years ago
- ☆15 Aug 3, 2021 Updated 4 years ago
- GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs ☆16 Apr 18, 2025 Updated 10 months ago
- Collection of scripts to build PyTorch and the domain libraries from source. ☆13 Feb 4, 2026 Updated 3 weeks ago
- Monitor processes and parallel workloads for hangs ☆16 Dec 27, 2019 Updated 6 years ago