☆72Feb 10, 2025Updated last year
Alternatives and similar repositories for libfabric-efa-demo
Users that are interested in libfabric-efa-demo are comparing it to the libraries listed below.
- ☆12May 30, 2025Updated 10 months ago
- ☆11Feb 17, 2026Updated last month
- Custom recipes for NVIDIA Nsight Systems post-collection analysis.☆16Nov 7, 2025Updated 5 months ago
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.☆210Updated this week
- ☆16Apr 7, 2026Updated last week
- Managed collective communication service☆24Sep 2, 2024Updated last year
- ☆81Jan 5, 2025Updated last year
- Open Fabric Interfaces☆780Updated this week
- Debug print operator for cudagraph debugging☆14Aug 2, 2024Updated last year
- Perplexity GPU Kernels☆565Nov 7, 2025Updated 5 months ago
- AWS DevOps for Docker - a sample project to help you build Docker containers and run them on AWS. In addition to running locally, this p…☆41May 27, 2021Updated 4 years ago
- Microsoft Collective Communication Library☆389Sep 20, 2023Updated 2 years ago
- ☆85Dec 17, 2025Updated 3 months ago
- NVIDIA Inference Xfer Library (NIXL)☆970Updated this week
- Lustre Repository with MS patches☆15Updated this week
- Build and run container environment for LFRic☆10Jan 8, 2024Updated 2 years ago
- Kubernetes CSI Driver for serving OCI model artifacts☆25Mar 23, 2026Updated 3 weeks ago
- pytorch ucc plugin☆23Jul 8, 2021Updated 4 years ago
- Perplexity open source garden for inference technology☆390Dec 25, 2025Updated 3 months ago
- Openfold inference architecture for Amazon EKS☆11Oct 1, 2024Updated last year
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches☆80Jul 25, 2023Updated 2 years ago
- Lustre diagnostic tools for running Lustre in Azure☆10Apr 17, 2024Updated last year
- ☆166Dec 27, 2024Updated last year
- This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It …☆46May 29, 2025Updated 10 months ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆20Oct 23, 2023Updated 2 years ago
- Cray Lustre is HPE's curated Lustre distro for HPE ClusterStor, Cray EX, and other HPE/Cray clients☆18Updated this week
- Implementation of M4 in Python☆10Dec 4, 2022Updated 3 years ago
- NCCL Profiling Kit☆152Jul 1, 2024Updated last year
- A low-latency & high-throughput serving engine for LLMs☆490Jan 8, 2026Updated 3 months ago
- Reduction Server in Rust☆14Apr 9, 2024Updated 2 years ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.☆70Mar 20, 2025Updated last year
- Convert Travis.yml to GitHub Actions workflows.☆12Aug 20, 2022Updated 3 years ago
- Linux tree for ntrdma driver development.☆11Jun 29, 2017Updated 8 years ago
- Expert Specialization MoE Solution based on CUTLASS☆26Jan 19, 2026Updated 2 months ago
- torchcomms: a modern PyTorch communications API☆356Updated this week
- Framework to reduce autotune overhead to zero for well known deployments.☆98Sep 19, 2025Updated 6 months ago
- ☆16Apr 2, 2026Updated 2 weeks ago
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆16Feb 4, 2025Updated last year
- MSCCL++: A GPU-driven communication stack for scalable AI applications☆498Updated this week