A PyTorch native platform for training generative AI models
☆16 · Nov 18, 2025 · Updated 4 months ago
Alternatives and similar repositories for torchtitan-amd
Users that are interested in torchtitan-amd are comparing it to the libraries listed below.
- Primus-SaFE (Stability and Fault Endurance) ☆56 · Updated this week
- ☆14 · Sep 7, 2023 · Updated 2 years ago
- A Symbolic Emulator for Shuffle Synthesis on the NVIDIA PTX Code ☆15 · Mar 19, 2023 · Updated 3 years ago
- setup.py cheatsheet ☆16 · Sep 13, 2014 · Updated 11 years ago
- ☆12 · Apr 1, 2025 · Updated last year
- The goal of the OSSCI Fleet is to provide a central mechanism to enable test automation, batch job scheduling, and developer access to a … ☆13 · Mar 24, 2026 · Updated 3 weeks ago
- ☆19 · Feb 25, 2026 · Updated last month
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆18 · Nov 4, 2025 · Updated 5 months ago
- Ahead of Time (AOT) Triton Math Library ☆96 · Apr 8, 2026 · Updated last week
- WheelNext Website ☆51 · Dec 19, 2025 · Updated 3 months ago
- Tristan-MP v2 [public] ☆18 · Dec 29, 2024 · Updated last year
- Efficient kernel for RMS normalization with fused operations, includes both forward and backward passes, compatibility with PyTorch. ☆13 · Jun 5, 2024 · Updated last year
- ☆20 · Jun 13, 2025 · Updated 10 months ago
- ☆10 · Mar 22, 2024 · Updated 2 years ago
- The official implementation of ImageBind-LLM and Whisper-LLM from the paper "Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Compre… ☆21 · Oct 30, 2023 · Updated 2 years ago
- Code for benchmarking the speed of DeepSeek R1 from different providers' APIs. ☆16 · Mar 21, 2025 · Updated last year
- CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark ☆34 · Apr 9, 2026 · Updated last week
- A beautiful telnet/ssh client optimized for Mandarin BBS ☆21 · Sep 8, 2009 · Updated 16 years ago
- ☆66 · Updated this week
- University of Zurich Master Thesis Template (Universität Zürich Masterarbeit Vorlage) in RMarkdown and Latex ☆12 · Dec 8, 2021 · Updated 4 years ago
- ☆65 · Updated this week
- Parsing and analysis of IRC logs ☆14 · Nov 30, 2018 · Updated 7 years ago
- Flexible Computational Science (FleCSI) Project ☆25 · Updated this week
- OMP4Py: a native Python implementation of OpenMP ☆29 · Apr 6, 2026 · Updated last week
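- Northeastern University (东北大学) undergraduate graduation project thesis LaTeX template, adapted to the new writing and printing specification for the class of 2021, aimed at computer science majors ☆11 · Apr 21, 2021 · Updated 4 years ago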
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆27 · Mar 31, 2026 · Updated 2 weeks ago
- ASCII back-end for matplotlib ☆21 · Dec 25, 2015 · Updated 10 years ago
- An intelligent agent that understands your code and crafts perfect Git artifacts ☆34 · Apr 9, 2026 · Updated last week
- Scientific Machine Learning Tutorials ☆40 · Nov 20, 2021 · Updated 4 years ago
- Testing if I can implement slurm in an operator ☆15 · Nov 3, 2024 · Updated last year
- A template and style file for generating reports / thesis in LaTeX. Specific for BITS-Pilani students ☆19 · Mar 22, 2016 · Updated 10 years ago
- Random collections of my interested research papers / projects ☆20 · May 20, 2021 · Updated 4 years ago
- Ongoing research training transformer models at scale ☆39 · Updated this week
- Official PyTorch implementation of CD-MOE ☆12 · Mar 18, 2026 · Updated 3 weeks ago
- VGA LCD Core (OpenCores) ☆15 · May 22, 2018 · Updated 7 years ago
- Comb is a communication performance benchmarking tool. ☆26 · Feb 27, 2023 · Updated 3 years ago
- Light-weight real-time multi-object detection and tracking in Nvidia TX2 ☆10 · May 10, 2019 · Updated 6 years ago
- AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming ☆183 · Apr 9, 2026 · Updated last week
- Benchmark implementation of CosmoFlow in TensorFlow Keras ☆22 · Feb 7, 2024 · Updated 2 years ago