Sequence-level 1F1B schedule for LLMs.
☆19 · Jun 4, 2024 · Updated 2 years ago
Alternatives and similar repositories for Seq1F1B
Users that are interested in Seq1F1B are comparing it to the libraries listed below.
- Distributed IO-aware Attention algorithm ☆24 · Sep 24, 2025 · Updated 8 months ago
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling ☆55 · Jul 15, 2025 · Updated 11 months ago
- ☆14 · Jan 12, 2022 · Updated 4 years ago
- ☆17 · May 10, 2024 · Updated 2 years ago
- Integrated Training Platform (ITP) traces used in ElasticFlow paper. ☆31 · Dec 23, 2022 · Updated 3 years ago
- Linear Attention Sequence Parallelism (LASP) ☆88 · Jun 4, 2024 · Updated 2 years ago
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024) ☆19 · May 28, 2024 · Updated 2 years ago
- Schedule free optimiser implemented in JAX using Optimistix ☆15 · May 29, 2024 · Updated 2 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆11 · Dec 13, 2023 · Updated 2 years ago
- The code for our paper "Neural Architecture Search as Program Transformation Exploration" ☆16 · Apr 28, 2021 · Updated 5 years ago
- Supplemental materials for The ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning ☆25 · May 12, 2025 · Updated last year
- Zero Bubble Pipeline Parallelism ☆459 · May 7, 2025 · Updated last year
- ☆16 · Jan 14, 2025 · Updated last year
- Light-weight Performance Variance Detection for Production-run Parallel Applications ☆16 · Aug 28, 2023 · Updated 2 years ago
- Scaling Sparse Fine-Tuning to Large Language Models ☆19 · Jan 31, 2024 · Updated 2 years ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- Original code base for On Pretraining Data Diversity for Self-Supervised Learning ☆14 · Dec 30, 2024 · Updated last year
- Efficient Hyper-parameter Tuning at Scale (VLDB'22) ☆10 · Dec 1, 2021 · Updated 4 years ago
- Code for reproducing experiments performed for Accordion ☆13 · Jun 11, 2021 · Updated 5 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆36 · Jan 9, 2023 · Updated 3 years ago
- Implementation of AAAI 2022 Paper: Go wider instead of deeper ☆32 · Oct 27, 2022 · Updated 3 years ago
- Fast and memory-efficient exact attention ☆22 · Updated this week
- ☆139 · May 29, 2025 · Updated last year
- ☆24 · Nov 27, 2025 · Updated 6 months ago
- ☆78 · May 4, 2021 · Updated 5 years ago
- UniRL is a Framework for Unified Multimodal Model Reinforcement Learning ☆644 · Updated this week
- Code Implementation for AutoAttend: Automated Attention Representation Search ☆11 · Jul 26, 2021 · Updated 4 years ago
- MR project investigating the mediating role of mammographic density in the childhood body size and breast cancer relationship ☆11 · May 14, 2024 · Updated 2 years ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆37 · Aug 14, 2024 · Updated last year
- ☆11 · Feb 23, 2024 · Updated 2 years ago
- SGLang Kernel Wheel Index ☆23 · Updated this week
- MUA-RL: MULTI-TURN USER-INTERACTING AGENT REINFORCEMENT LEARNING FOR AGENTIC TOOL USE ☆62 · Nov 5, 2025 · Updated 7 months ago
- 🤖FFPA: Extends FlashAttention-2 via Split-D for large headdims, 1.5x~3×↑🎉 vs SDPA, up to 430T🎉 on H200. ☆310 · Updated this week
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- Dataset of PiTree project. ☆13 · Dec 27, 2021 · Updated 4 years ago
- Intelligent Resource Requirement Estimation and Scheduling for Deep Learning Jobs on Distributed GPU Clusters ☆16 · Nov 18, 2021 · Updated 4 years ago
- Implement matrix multiplication using SIMD. ☆13 · Nov 19, 2016 · Updated 9 years ago
- A Generic Resource-Aware Hyperparameter Tuning Execution Engine ☆15 · Jan 8, 2022 · Updated 4 years ago
- ☆15 · Oct 16, 2018 · Updated 7 years ago