☆94 Jul 3, 2022 Updated 3 years ago
Alternatives and similar repositories for DT-FM
Users that are interested in DT-FM are comparing it to the libraries listed below.
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆118 Jan 13, 2022 Updated 4 years ago
- Website for Systems Research Seminar at UIUC ☆21 May 7, 2026 Updated last month
- A resilient distributed training framework ☆99 Apr 11, 2024 Updated 2 years ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ☆44 Nov 4, 2022 Updated 3 years ago
- AI model training on heterogeneous, geo-distributed resources ☆44 Nov 24, 2025 Updated 7 months ago
- ☆15 Sep 15, 2022 Updated 3 years ago
- ☆48 Sep 13, 2025 Updated 9 months ago
- ☆250 Jul 25, 2024 Updated last year
- Memory-efficient transformer. Work in progress. ☆19 Sep 17, 2022 Updated 3 years ago
- [NeurIPS 2022] JAX/Haiku implementation of "On Privacy and Personalization in Cross-Silo Federated Learning" ☆27 Apr 16, 2023 Updated 3 years ago
- Accommodating Large Language Model Training over Heterogeneous Environment. ☆32 Mar 13, 2025 Updated last year
- Early exit ensembles ☆12 Dec 4, 2021 Updated 4 years ago
- Code for the paper "Secure Distributed Training at Scale" (ICML 2022) ☆16 Feb 4, 2025 Updated last year
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but … ☆13 Dec 1, 2022 Updated 3 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group ☆16 Aug 29, 2022 Updated 3 years ago
- Solidity contracts for the decentralized Prime Network protocol ☆26 Jul 6, 2025 Updated 11 months ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters. ☆37 May 6, 2024 Updated 2 years ago
- A framework for PyTorch to enable fault management for collective communication libraries (CCL) such as NCCL ☆20 Feb 9, 2026 Updated 4 months ago
- Federated reconnaissance mini-ImageNet benchmark and baseline models ☆13 Sep 2, 2021 Updated 4 years ago
- Large scale graph learning on a single machine. ☆167 Feb 25, 2025 Updated last year
- Integrated Training Platform (ITP) traces used in ElasticFlow paper. ☆31 Dec 23, 2022 Updated 3 years ago
- Chimera: bidirectional pipeline parallelism for efficiently training large-scale models. ☆72 Mar 20, 2025 Updated last year
- Sample, estimate, aggregate: A recipe for causal discovery foundation models ☆17 Jun 21, 2024 Updated 2 years ago
- Training a model similar to OpenAI DALL-E with volunteers from all over the Internet using hivemind and dalle-pytorch (NeurIPS 2021 demo) ☆27 May 29, 2023 Updated 3 years ago
- Some CS notes during Jiawei's undergrad. ☆33 Jan 6, 2022 Updated 4 years ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Oct 29, 2023 Updated 2 years ago
- [ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo ☆74 Mar 11, 2026 Updated 3 months ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆134 Feb 22, 2024 Updated 2 years ago
- Surrogate-based Hyperparameter Tuning System ☆30 Jun 29, 2023 Updated 3 years ago
- Swan Benchmark Suite ☆13 Sep 17, 2025 Updated 9 months ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training ☆1,890 Updated this week
- Smart Contract tools to help streamline Ethereum dapp development and deployment ☆13 Oct 16, 2017 Updated 8 years ago
- Proof of concept, using Sysdig metrics as the decision variable for a Kubernetes scheduler ☆14 Nov 3, 2017 Updated 8 years ago
- ☆62 May 4, 2024 Updated 2 years ago
- This technique modifies image data so that any model trained on it will bear an identifiable mark. ☆45 Aug 13, 2021 Updated 4 years ago
- Implementing a Turing-complete computer (OISC) within a zk-SNARKS circuit. ☆13 Nov 24, 2021 Updated 4 years ago
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation ☆18 Sep 2, 2024 Updated last year
- [ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training" ☆19 Feb 20, 2025 Updated last year
- "Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implemen… ☆56 Nov 5, 2020 Updated 5 years ago