lvyufeng / cybertron-ai
MindSpore implementation of transformers
☆68 · Updated 2 years ago
Alternatives and similar repositories for cybertron-ai
Users interested in cybertron-ai are comparing it to the libraries listed below.
- Natural Language Processing Tutorial for MindSpore Users ☆140 · Updated last year
- MindSpore implementations of Generative Adversarial Networks ☆23 · Updated 3 years ago
- Model Compression for Big Models ☆165 · Updated 2 years ago
- ATC23 AE ☆47 · Updated 2 years ago
- An awesome GPU task scheduler. A lightweight, easy-to-use tool for scheduling jobs on GPU clusters; give it a star if you find it useful ☆193 · Updated 3 years ago
- ☆79 · Updated last year
- MindSpore implementation of "Dive into Deep Learning" (《动手学深度学习》), for MindSpore learners to use alongside Mu Li's course ☆123 · Updated 2 years ago
- ☆150 · Updated 5 months ago
- Triton Documentation in Simplified Chinese / Triton 中文文档 ☆95 · Updated 3 weeks ago
- LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training ☆406 · Updated 4 months ago
- ☆18 · Updated 3 years ago
- Inference code for LLaMA models ☆128 · Updated 2 years ago
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain ☆263 · Updated 2 years ago
- Models and examples built with OneFlow ☆100 · Updated last year
- A MoE implementation for PyTorch, [ATC'23] SmartMoE ☆70 · Updated 2 years ago
- The record of what I've been through. Now moved to Notion; see link below ☆101 · Updated 10 months ago
- ☆63 · Updated last year
- ☆51 · Updated 2 years ago
- An implementation of parallelism techniques such as AMP, DDP, PP, and TP, for learning purposes ☆14 · Updated 2 years ago
- A PyTorch-like automatic differentiation tool implemented in pure Python, for learning purposes ☆51 · Updated last year
- ☆36 · Updated 11 months ago
- An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes ☆159 · Updated last year
- Collaborative Training of Large Language Models in an Efficient Way ☆417 · Updated last year
- A Transformer model based on the Gated Attention Unit (preview version) ☆98 · Updated 2 years ago
- ☆84 · Updated 2 years ago
- DeepSpeed tutorials, annotated examples, and study notes (efficient training of large models) ☆183 · Updated 2 years ago
- qwen-nsa ☆84 · Updated last month
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆213 · Updated 10 months ago
- (Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from …" ☆185 · Updated last year
- Implementation of FlashAttention in PyTorch ☆176 · Updated 11 months ago