☆16Mar 30, 2024Updated last year
Alternatives and similar repositories for megatron-lm-parallel-group-playground
Users that are interested in megatron-lm-parallel-group-playground are comparing it to the libraries listed below.
- ☆11Dec 26, 2025Updated 3 months ago
- ☆15Apr 15, 2022Updated 3 years ago
- OneFlow->ONNX☆43Apr 19, 2023Updated 2 years ago
- OneFlow Serving☆20Apr 10, 2025Updated 11 months ago
- ☆23Apr 25, 2023Updated 2 years ago
- study of cutlass☆22Nov 10, 2024Updated last year
- a size profiler for cuda binary☆71Jan 15, 2026Updated 2 months ago
- ☆52May 19, 2025Updated 10 months ago
- ☆34Feb 3, 2025Updated last year
- An auxiliary project analyzing the characteristics of KV in DiT Attention.☆34Nov 29, 2024Updated last year
- ☆12Mar 13, 2023Updated 3 years ago
- A toolkit for developers to simplify the transformation of nn.Module instances. It's now corresponding to Pytorch.fx.☆13Apr 7, 2023Updated 2 years ago
- AlgorithmNote is a knowledge sharing github page, mainly has three parts: algorithm, engineering and basic knowledge.☆14Feb 17, 2015Updated 11 years ago
- Odysseus: Playground of LLM Sequence Parallelism☆79Jun 17, 2024Updated last year
- Stable Diffusion inference benchmarks☆10Jun 14, 2024Updated last year
- A more efficient yolov5 with oneflow backend 🎉🎉🎉☆215Jul 10, 2025Updated 8 months ago
- a simple programming language under development☆11Dec 3, 2023Updated 2 years ago
- A forked version of flux-fast that makes flux-fast even faster with cache-dit, 3.3x speedup on NVIDIA L20.☆24Jul 18, 2025Updated 8 months ago
- The official github repo for the open online courses: "Dive into LLMs".☆10Mar 15, 2024Updated 2 years ago
- Document the demo and a series of documents for learning the diffusion model.☆41Jun 29, 2023Updated 2 years ago
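- An open-source mathematical large-model project that aims to explore whether large models have mathematical creativity, and what potential they hold for frontier mathematical research.☆18Mar 19, 2026Updated last week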
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs☆1,000Mar 3, 2026Updated 3 weeks ago
- In our implementation of Qwen-Image-Edit, we employ block causal attention to improve inference speed.☆45Feb 16, 2026Updated last month
- ☆16Sep 12, 2023Updated 2 years ago
- Inference with YOLOv5, OpenCV 4.5.4 DNN, C++, ROS and Python☆13Feb 12, 2023Updated 3 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆17Jun 3, 2024Updated last year
- Utility scripts for PyTorch (e.g. Make Perfetto show some disappearing kernels, Memory profiler that understands more low-level allocatio…☆94Sep 11, 2025Updated 6 months ago
- common util library for C++☆12Feb 28, 2026Updated last month
- Standalone Flash Attention v2 kernel without libtorch dependency☆113Sep 10, 2024Updated last year
- Final-round solution for the first China ECG Intelligence Challenge (public version). Competition website: http://mdi.ids.tsinghua.edu.cn/☆10Aug 21, 2019Updated 6 years ago
- ☆13Mar 27, 2023Updated 3 years ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆20Jan 24, 2025Updated last year
- KsanaDiT: High-Performance DiT (Diffusion Transformer) Inference Framework for Video & Image Generation☆46Mar 6, 2026Updated 3 weeks ago
- A PyTorch implementation of computing mean average precision in parallel☆16Jul 7, 2022Updated 3 years ago
- Draw flowcharts using natural language, based on OpenAI☆12Nov 13, 2023Updated 2 years ago
- My notes about mathematics.☆19Mar 5, 2026Updated 3 weeks ago
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052☆478Mar 15, 2024Updated 2 years ago
- ☆12Aug 10, 2022Updated 3 years ago
- This is the official repo for the paper "Accelerating Parallel Sampling of Diffusion Models" Tang et al. ICML 2024 https://openreview.net…☆16Jul 19, 2024Updated last year