lvyufeng / Cybertron
MindSpore implementation of transformers
☆66 Updated last year
Related projects
Alternatives and complementary repositories for Cybertron
- Natural Language Processing Tutorial for MindSpore Users ☆140 Updated 7 months ago
- The official code for the paper "Parallel Speculative Decoding with Adaptive Draft Length." ☆24 Updated 2 months ago
- ATC23 AE ☆43 Updated last year
- MindSpore implementation of "Dive into Deep Learning" (《动手学深度学习》), for MindSpore learners to use alongside Mu Li's course. ☆108 Updated last year
- ☆18 Updated 2 years ago
- Transformer model based on the Gated Attention Unit (preview version) ☆97 Updated last year
- Blog posts, reading reports, and code examples on AGI/LLM-related topics. ☆19 Updated 3 months ago
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models ☆184 Updated 6 months ago
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models ☆53 Updated 5 months ago
- An MoE implementation for PyTorch, [ATC'23] SmartMoE ☆57 Updated last year
- PyTorch distributed tutorials ☆83 Updated last month
- Must-read papers on improving efficiency for pre-trained language models. ☆102 Updated 2 years ago
- ☆155 Updated last month
- An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes ☆147 Updated last month
- ☆82 Updated last year
- ☆74 Updated 11 months ago
- A paper list about diffusion models for natural language processing. ☆174 Updated last year
- MindSpore implementations of Generative Adversarial Networks. ☆21 Updated 2 years ago
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain ☆244 Updated 11 months ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408 ☆191 Updated last year
- Inference code for LLaMA models ☆109 Updated last year
- Sample code and baseline implementation for Challenge 3 (擂台赛3), the large-scale pre-training tuning competition ☆38 Updated 2 years ago
- ☆64 Updated 3 months ago
- Model Compression for Big Models ☆151 Updated last year
- ☆51 Updated last year
- Official implementation of TransNormerLLM: A Faster and Better LLM ☆229 Updated 9 months ago
- Grab GPU whenever available ☆279 Updated 2 years ago
- Multi-Candidate Speculative Decoding ☆28 Updated 6 months ago
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆139 Updated 5 months ago
- ☆145 Updated this week