☆70 Feb 10, 2025 Updated last year
Alternatives and similar repositories for libfabric-efa-demo
Users that are interested in libfabric-efa-demo are comparing it to the libraries listed below
- ☆12 May 30, 2025 Updated 9 months ago
- ☆11 Feb 17, 2026 Updated 2 weeks ago
- ☆16 Feb 27, 2026 Updated last week
- ☆77 Jan 5, 2025 Updated last year
- Debug print operator for cudagraph debugging ☆14 Aug 2, 2024 Updated last year
- Generic Async Semaphore ☆21 Apr 12, 2025 Updated 10 months ago
- Triton kernels for Flux ☆22 Jul 7, 2025 Updated 8 months ago
- ☆87 Oct 17, 2025 Updated 4 months ago
- Microsoft Collective Communication Library ☆385 Sep 20, 2023 Updated 2 years ago
- Open Fabric Interfaces ☆764 Updated this week
- NVIDIA Inference Xfer Library (NIXL) ☆898 Feb 28, 2026 Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆20 Oct 23, 2023 Updated 2 years ago
- Perplexity GPU Kernels ☆567 Nov 7, 2025 Updated 4 months ago
- Framework to reduce autotune overhead to zero for well known deployments. ☆97 Sep 19, 2025 Updated 5 months ago
- ☆85 Dec 17, 2025 Updated 2 months ago
- pytorch ucc plugin ☆23 Jul 8, 2021 Updated 4 years ago
- Scripts to customize AWS ParallelCluster ☆28 Sep 5, 2025 Updated 6 months ago
- A low-latency & high-throughput serving engine for LLMs ☆482 Jan 8, 2026 Updated last month
- Perplexity open source garden for inference technology ☆371 Dec 25, 2025 Updated 2 months ago
- ☆160 Dec 27, 2024 Updated last year
- ☆34 Feb 3, 2025 Updated last year
- AWS DevOps for Docker - a sample project to help you build Docker containers and run them on AWS. In addition to running locally, this p… ☆41 May 27, 2021 Updated 4 years ago
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆122 Nov 15, 2023 Updated 2 years ago
- ☆87 Jan 23, 2025 Updated last year
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆80 Jul 25, 2023 Updated 2 years ago
- Writing FLUX in Triton ☆42 Sep 22, 2024 Updated last year
- Create async web requests via Python in no time. ☆11 Jan 8, 2026 Updated last month
- Repository for go shared libraries (for now). ☆11 Dec 1, 2025 Updated 3 months ago
- Lustre Repository with MS patches ☆13 Feb 28, 2026 Updated last week
- Distributed MoE in a Single Kernel [NeurIPS '25] ☆194 Feb 27, 2026 Updated last week
- This repository contains the experimental PyTorch native float8 training UX ☆226 Aug 1, 2024 Updated last year
- torchcomms: a modern PyTorch communications API ☆344 Updated this week
- NCCL Profiling Kit ☆152 Jul 1, 2024 Updated last year
- ☆40 Jul 26, 2024 Updated last year
- ByteCheckpoint: An Unified Checkpointing Library for LFMs ☆270 Feb 2, 2026 Updated last month
- Linux tree for ntrdma driver development. ☆11 Jun 29, 2017 Updated 8 years ago
- Distributed lock backed by Dynamodb ☆11 Dec 7, 2023 Updated 2 years ago
- This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate… ☆13 Dec 31, 2024 Updated last year
- Unofficial WhatsApp client with dark Onyx theme ☆11 Jan 26, 2020 Updated 6 years ago