lvyufeng / Cybertron
MindSpore implementation of transformers
☆66 · Updated 2 years ago
Alternatives and similar repositories for Cybertron:
Users who are interested in Cybertron are comparing it to the libraries listed below:
- Natural Language Processing Tutorial for MindSpore Users ☆142 · Updated last year
- MindSpore implementation of "Dive into Deep Learning" (《动手学深度学习》), for MindSpore learners to use alongside Mu Li's course. ☆115 · Updated last year
- Transformer model based on the Gated Attention Unit (early-preview version) ☆97 · Updated 2 years ago
- An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes ☆153 · Updated 6 months ago
- MindSpore implementations of Generative Adversarial Networks. ☆22 · Updated 2 years ago
- ☆116 · Updated this week
- Inference code for LLaMA models ☆118 · Updated last year
- ☆70 · Updated 2 months ago
- ☆78 · Updated last year
- A PyTorch-like automatic differentiation tool implemented in pure Python, for learning purposes. ☆51 · Updated 11 months ago
- An MoE implementation for PyTorch, [ATC'23] SmartMoE ☆62 · Updated last year
- ☆52 · Updated last year
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain ☆251 · Updated last year
- ☆191 · Updated 6 months ago
- Paper List for In-context Learning 🌷 ☆181 · Updated last year
- ☆84 · Updated last year
- ☆33 · Updated 4 months ago
- Model Compression for Big Models ☆160 · Updated last year
- A paper list about diffusion models for natural language processing. ☆182 · Updated last year
- ☆18 · Updated 2 years ago
- PyTorch distributed tutorials ☆123 · Updated this week
- ATC23 AE ☆45 · Updated last year
- Must-read papers on improving efficiency for pre-trained language models. ☆103 · Updated 2 years ago
- CCKS2023-PromptCBLUE: Code implementation for the TianChi competition ☆18 · Updated last year
- Pretrain CPM-1 ☆51 · Updated 4 years ago
- An implementation of parallelism techniques such as AMP, DDP, PP, and TP for learning purposes ☆12 · Updated last year
- The record of what I've been through. ☆98 · Updated 3 months ago
- Arena Competition 3: example code and baseline implementation for the large-scale pretraining tuning competition ☆38 · Updated 2 years ago
- Triton documentation in Simplified Chinese ☆66 · Updated last week
- From Llama to DeepSeek, with GRPO/MTP implemented and PT/SFT/LoRA/QLoRA included ☆26 · Updated 3 weeks ago