lvyufeng / cybertron-ai
MindSpore implementation of transformers
☆68Updated 2 years ago
Alternatives and similar repositories for cybertron-ai
Users that are interested in cybertron-ai are comparing it to the libraries listed below
- Natural Language Processing Tutorial for MindSpore Users☆140Updated last year
- An awesome GPU task scheduler. A lightweight, easy-to-use task scheduling tool for GPU clusters. If you find it useful, please give it a star.☆194Updated 3 years ago
- an implementation of transformer, bert, gpt, and diffusion models for learning purposes☆159Updated last year
- ☆79Updated 2 years ago
- Model Compression for Big Models☆166Updated 2 years ago
- ☆82Updated last month
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l…☆285Updated 2 years ago
- MindSpore implementations of Generative Adversarial Networks.☆23Updated 3 years ago
- MindSpore implementation of "Dive into Deep Learning" (《动手学深度学习》). For MindSpore learners to use alongside Mu Li's course.☆124Updated 2 years ago
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training☆406Updated 5 months ago
- ATC23 AE☆47Updated 2 years ago
- A convenient script for grabbing GPUs☆389Updated 2 weeks ago
- ☆150Updated 6 months ago
- The record of what I've been through. Now moved to Notion. See link below☆102Updated 11 months ago
- ☆51Updated 2 years ago
- Lion and Adam optimization comparison☆64Updated 2 years ago
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain☆264Updated 2 years ago
- Collaborative Training of Large Language Models in an Efficient Way☆417Updated last year
- Inference code for LLaMA models☆128Updated 2 years ago
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**☆214Updated 10 months ago
- PyTorch distributed tutorials☆164Updated 6 months ago
- ☆84Updated 2 years ago
- Transformer model based on the Gated Attention Unit (preview version)☆98Updated 2 years ago
- all courses' homework☆118Updated 5 years ago
- A purer tokenizer with a higher compression rate☆488Updated last year
- RoFormer V1 & V2 pytorch☆517Updated 3 years ago
- A PyTorch-like automatic differentiation tool implemented in pure Python, for learning purposes.☆51Updated last year
- A MoE impl for PyTorch, [ATC'23] SmartMoE☆71Updated 2 years ago
- DeepSpeed tutorials, annotated examples, and study notes (efficient training of large models)☆184Updated 2 years ago
- (Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from …☆188Updated last year