BBuf/megatron-lm-parallel-group-playground

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/BBuf/megatron-lm-parallel-group-playground)

BBuf / megatron-lm-parallel-group-playground

☆16

Alternatives and similar repositories for megatron-lm-parallel-group-playground

Users that are interested in megatron-lm-parallel-group-playground are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Adlik / model_zoo
View on GitHub
☆11Dec 26, 2025Updated 4 months ago
billmuch / matmul_perf_test
View on GitHub
☆15Apr 15, 2022Updated 4 years ago
Oneflow-Inc / oneflow_convert
View on GitHub
OneFlow->ONNX
☆42Apr 19, 2023Updated 3 years ago
Oneflow-Inc / oneflow-xrt
View on GitHub
☆23Apr 25, 2023Updated 3 years ago
yester31 / Cutlass_EX
View on GitHub
study of cutlass
☆22Nov 10, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
flashinfer-ai / cubloaty
View on GitHub
a size profiler for cuda binary
☆70Jan 15, 2026Updated 3 months ago
tile-ai / AttentionEngine
View on GitHub
☆52May 19, 2025Updated 11 months ago
BBuf / tensorrt-llm-moe
View on GitHub
☆33Feb 3, 2025Updated last year
xdit-project / DiTCacheAnalysis
View on GitHub
An auxiliary project analysis of the characteristics of KV in DiT Attention.
☆34Nov 29, 2024Updated last year
Oneflow-Inc / one-fx
View on GitHub
A toolkit for developers to simplify the transformation of nn.Module instances. It's now corresponding to Pytorch.fx.
☆13Apr 7, 2023Updated 3 years ago
hudingding / TridentNet-mmdetection
View on GitHub
TridentNet in mmdetection
☆22Apr 2, 2020Updated 6 years ago
feifeibear / Odysseus-Transformer
View on GitHub
Odysseus: Playground of LLM Sequence Parallelism
☆78Jun 17, 2024Updated last year
alexeigor / sd-benchmarks
View on GitHub
Stable Diffusion inference benchmarks
☆10Jun 14, 2024Updated last year
meta-pytorch / kraken
View on GitHub
Triton-based Symmetric Memory operators and examples
☆98Mar 28, 2026Updated last month
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
Oneflow-Inc / one-yolov5
View on GitHub
A more efficient yolov5 with oneflow backend 🎉🎉🎉
☆215Jul 10, 2025Updated 9 months ago
FusionBolt / Rc-lang
View on GitHub
a simple programming language under development
☆11Dec 3, 2023Updated 2 years ago
ServiceNow / webarena-verified
View on GitHub
A verified version of the WebArena Benchmark
☆37Mar 8, 2026Updated 2 months ago
aiwaves-cn / Dive-into-LLMs
View on GitHub
The official github repo for the open online courses: "Dive into LLMs".
☆10Mar 15, 2024Updated 2 years ago
volcengine / veScale
View on GitHub
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
☆1,010Mar 3, 2026Updated 2 months ago
ziplab / efficient-stable-diffusion
View on GitHub
☆16Sep 12, 2023Updated 2 years ago
YellowAndGreen / Yolov5-OpenCV-Cpp-Python-ROS
View on GitHub
Inference with YOLOv5, OpenCV 4.5.4 DNN, C++, ROS and Python
☆13Feb 12, 2023Updated 3 years ago
SkyworkAI / vllm
View on GitHub
A high-throughput and memory-efficient inference and serving engine for LLMs
☆17Jun 3, 2024Updated last year
catid / cuda_float_compress
View on GitHub
Python package for compressing floating-point PyTorch tensors
☆13Jul 22, 2024Updated last year
End-to-end encrypted email - Proton Mail • Ad
Special offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
spencer-luo / common_util
View on GitHub
common util library for C++
☆12Apr 22, 2026Updated 2 weeks ago
L1aoXingyu / llm-infer-bench
View on GitHub
☆12Sep 1, 2023Updated 2 years ago
tlc-pack / libflash_attn
View on GitHub
Standalone Flash Attention v2 kernel without libtorch dependency
☆113Sep 10, 2024Updated last year
BBuf / oneflow-cifar
View on GitHub
☆13Mar 27, 2023Updated 3 years ago
IST-DASLab / gemm-fp8
View on GitHub
High Performance FP8 GEMM Kernels for SM89 and later GPUs.
☆21Jan 24, 2025Updated last year
Tencent / KsanaDiT
View on GitHub
KsanaDiT: High-Performance DiT (Diffusion Transformer) Inference Framework for Video & Image Generation
☆50Mar 30, 2026Updated last month
Oneflow-Inc / oneflow_face
View on GitHub
☆12Aug 10, 2022Updated 3 years ago
ayaka14732 / llama-jax
View on GitHub
JAX implementation of LLaMA, aiming to train LLaMA on Google Cloud TPU
☆14Jul 22, 2023Updated 2 years ago
caijixueIT / CUDA_Learning_for_Freshman
View on GitHub
☆14Nov 3, 2025Updated 6 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
yuandong-tian / understanding
View on GitHub
Understanding deep networks and large models.
☆28Jan 23, 2026Updated 3 months ago
triple-mu / HunyuanDiT-TensorRT-libtorch
View on GitHub
HunyuanDiT with TensorRT and libtorch
☆18May 22, 2024Updated last year
flagos-ai / libtriton_jit
View on GitHub
A Triton JIT runtime and ffi provider in C++
☆33Apr 28, 2026Updated last week
GeekGuru123 / ProfilingDiT
View on GitHub
☆20Jan 1, 2026Updated 4 months ago
hkust-nlp / RL-Verifier-Robustness
View on GitHub
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
☆25Oct 7, 2025Updated 7 months ago
MSiam / motion_adaptation
View on GitHub
☆11Jun 2, 2019Updated 6 years ago
lucidrains / esbn-transformer
View on GitHub
An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
☆16Aug 3, 2021Updated 4 years ago