cisco-open/pymultiworld

A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL

☆20

Alternatives and similar repositories for pymultiworld

Users who are interested in pymultiworld are comparing it to the libraries listed below.

samsja / pydantic_config
Manage ML configuration with pydantic
☆16 · Mar 18, 2026 · Updated 4 months ago
PrimeIntellect-ai / smart-contracts
Solidity contracts for the decentralized Prime Network protocol
☆26 · Jul 6, 2025 · Updated last year
fix-project / fix
☆12 · Jun 10, 2026 · Updated last month
PrimeIntellect-ai / prime-iroh
Asynchronous P2P communication backend for decentralized pipeline parallelism
☆46 · Updated this week
PrimeIntellect-ai / prime-vllm
Modded vLLM to run pipeline parallelism over public networks
☆41 · May 20, 2025 · Updated last year
eligotts / legos
☆24 · Jan 22, 2026 · Updated 6 months ago
Aleph-Alpha-Research / scaling
Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…
☆66 · Nov 18, 2025 · Updated 8 months ago
PrimeIntellect-ai / pi-quant
SIMD quantization kernels
☆93 · May 29, 2026 · Updated 2 months ago
KorAP / Tokenizer-Evaluation
Benchmark scripts for comparing different tokenizers and sentence segmenters of German
☆12 · Feb 27, 2023 · Updated 3 years ago
PrimeIntellect-ai / experiments-autonomous-speedrunning
autonomous nanogpt optimizer speedrun
☆109 · May 14, 2026 · Updated 2 months ago
alexzhang13 / longcot-mini-rlm-results
Storing the LongCoT-mini results for RLM(GPT-5.2)
☆20 · Apr 26, 2026 · Updated 3 months ago
PrimeIntellect-ai / genesys
☆139 · Mar 20, 2025 · Updated last year
JannikSt / ibtop
Real-time terminal monitor for InfiniBand networks - htop for high-speed interconnects
☆141 · Dec 30, 2025 · Updated 6 months ago
mlcommons / training_results_v0.6
This repository contains the results and code for the MLPerf™ Training v0.6 benchmark.
☆42 · Jul 22, 2023 · Updated 3 years ago
soldni / springs
A set of utilities to turn Dataclasses into useful configuration managers.
☆11 · Mar 27, 2024 · Updated 2 years ago
facebookresearch / HolisticTraceAnalysis
A library to analyze PyTorch traces.
☆538 · May 29, 2026 · Updated 2 months ago
titu1994 / pyshac
A Python library for the Sequential Halving and Classification algorithm
☆20 · Apr 13, 2022 · Updated 4 years ago
PrimeIntellect-ai / protocol
peer-to-peer compute and intelligence network that enables decentralized AI development at scale
☆139 · Nov 10, 2025 · Updated 8 months ago
Jiaqi0602 / adversarial-attack-from-leakage
From Gradient Leakage to Adversarial Attacks in Federated Learning
☆15 · Sep 21, 2021 · Updated 4 years ago
piotrpawlaczek / python-blacken
A customisable GitHub action to check the style of Python code with black.
☆12 · May 20, 2024 · Updated 2 years ago
SampsonML / DiscoverPhysics
☆17 · May 31, 2026 · Updated last month
SymbioticLab / Fluid
A Generic Resource-Aware Hyperparameter Tuning Execution Engine
☆15 · Jan 8, 2022 · Updated 4 years ago
PrimeIntellect-ai / toploc
TOPLOC is a novel method for verifiable inference that enables users to verify that LLM providers are using the correct model configurat…
☆58 · Jul 13, 2026 · Updated 2 weeks ago
StanfordLegion / gasnet
☆13 · Jun 2, 2026 · Updated last month
Aleph-Alpha / Alpha-MoE
☆74 · Dec 10, 2025 · Updated 7 months ago
gapeng000 / hierarchical_kmeans
A package for creating hierarchical k-means/k-means tree/vocabulary tree.
☆16 · Dec 26, 2016 · Updated 9 years ago
PrimeIntellect-ai / pccl
PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP
☆157 · Sep 12, 2025 · Updated 10 months ago
bloomberg / pytest-pystack
Pytest plugin that runs PyStack on slow or hanging tests.
☆21 · Updated this week
cakeng / ASPEN
This is the proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Pa…
☆13 · Apr 4, 2024 · Updated 2 years ago
agentcooper / better-things-to-do
https://chrome.google.com/webstore/detail/better-things-to-do/begggblpkegcnammjagcmplfnpopocla
☆14 · Feb 25, 2017 · Updated 9 years ago
simon-mo / vLLM-Benchmark
☆33 · Apr 19, 2025 · Updated last year
apple / ml-dataset-decomposition
Official repo of dataset-decomposition paper [NeurIPS 2024]
☆21 · Jan 8, 2025 · Updated last year
AmberLJC / megatron-system-analysis
☆17 · Jan 31, 2026 · Updated 5 months ago
KerfuffleV2 / rusty-ggml
GGML bindings that aim to be idiomatic Rust rather than directly corresponding to the C/C++ interface
☆20 · Sep 25, 2023 · Updated 2 years ago
zakuro-ai / sakura
Sakura is the ML library of the Zakuro framework. It provides asynchronous distributed training for PyTorch.
☆18 · Jul 16, 2026 · Updated last week
kamwoh / chirpy3d
Chirpy3D: Part-Aware Multi-View Diffusion for Creative Fine-Grained Object Generation
☆30 · Jun 2, 2026 · Updated last month
josancamon19 / trace
Trajectory Recording and Capture Environments
☆19 · Jan 24, 2026 · Updated 6 months ago
KuangjuX / AttnLink
An experimental communicating attention kernel based on DeepEP.
☆34 · Jul 29, 2025 · Updated last year
efeslab / fiddler
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
☆267 · Nov 18, 2024 · Updated last year