AscendSpeed (☆79, Dec 15, 2023; updated 2 years ago)
Alternatives and similar repositories for AscendSpeed
Users interested in AscendSpeed are comparing it to the libraries listed below.
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitcode.com/Ascend/pytorch (☆491; updated this week)
- ☆18 (Mar 4, 2025; updated last year)
- Tensorflow implementation of DeepMind's Tacotron-2 (without wavenet) (☆11, Jul 12, 2019; updated 6 years ago)
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc. (☆39, May 8, 2024; updated last year)
- 🌟Official code of our AAAI26 paper 🔍WebFilter (☆37, Nov 9, 2025; updated 3 months ago)
- ☆18 (Jun 8, 2021; updated 4 years ago)
- Manages vllm-nccl dependency (☆17, Jun 3, 2024; updated last year)
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" (☆40, Nov 11, 2024; updated last year)
- ☆47 (Dec 13, 2024; updated last year)
- ☆20 (Sep 28, 2024; updated last year)
- ☆19 (Dec 6, 2023; updated 2 years ago)
- ☆335 (Jun 24, 2024; updated last year)
- ☆97 (Mar 26, 2025; updated 11 months ago)
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult… (☆41, Mar 17, 2024; updated last year)
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models (☆139, Jun 12, 2024; updated last year)
- Implementation of "Audio Retrieval with Natural Language Queries", INTERSPEECH 2021, PyTorch (☆26, Aug 18, 2023; updated 2 years ago)
- ☆23 (Jan 7, 2022; updated 4 years ago)
- ☆21 (Jan 18, 2017; updated 9 years ago)
- A machine learning competition in Automated Deep Learning (AutoDL), co-organized by ChaLearn, Google and 4Paradigm. Accepted at NeurIPS 2… (☆22, Dec 10, 2020; updated 5 years ago)
- play gemm with tvm (☆92, Jul 22, 2023; updated 2 years ago)
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… (☆164, Jan 12, 2026; updated last month)
- A baseline repository of Auto-Parallelism in Training Neural Networks (☆147, Jun 25, 2022; updated 3 years ago)
- XVERSE-7B: A multilingual large language model developed by XVERSE Technology Inc. (☆53, Apr 9, 2024; updated last year)
- Community maintained hardware plugin for vLLM on Ascend (☆1,711; updated this week)
- ☆219 (Aug 17, 2023; updated 2 years ago)
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 (☆1,436, Mar 20, 2024; updated last year)
- Standalone Flash Attention v2 kernel without libtorch dependency (☆114, Sep 10, 2024; updated last year)
- InfiniStore: an elastic serverless cloud storage system (VLDB'23) (☆24, May 5, 2023; updated 2 years ago)
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios (☆44, Feb 27, 2025; updated last year)
- ☆115 (Aug 26, 2024; updated last year)
- Compare different hardware platforms via the Roofline Model for LLM inference tasks (☆119, Mar 13, 2024; updated last year)
- Ring attention implementation with flash attention (☆987, Sep 10, 2025; updated 5 months ago)
- ☆183 (Jan 28, 2026; updated last month)
- ☆26 (Aug 14, 2022; updated 3 years ago)
- LLM training technologies developed by kwai (☆70, Jan 21, 2026; updated last month)
- [ICML'24] The official implementation of "Rethinking Optimization and Architecture for Tiny Language Models" (☆127, Jan 14, 2025; updated last year)
- Implementation of Global Style Token Tacotron in TensorFlow2 (☆26, Sep 28, 2020; updated 5 years ago)
- ☆36 (Feb 6, 2026; updated 3 weeks ago)
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, … (☆124, Dec 18, 2023; updated 2 years ago)