WangHuiNEU / Transformer_Knowlegde
Understanding the Transformer from its underlying mechanisms
☆22
Related projects:
- A lightweight script for maintaining a lot of machine learning experiments. ☆88
- A Transformer model based on the Gated Attention Unit (preview version). ☆95
- The official repo of INF-34B models trained by INF Technology. ☆32
- ☆125
- Yet another PyTorch Trainer and some core components for deep learning. ☆202
- Implementations of several positional-encoding schemes used in the Transformer. ☆36
- A Tight-fisted Optimizer. ☆46
- Must-read papers on improving efficiency for pre-trained language models. ☆100
- An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes. ☆139
- DeepSpeed tutorials, annotated examples, and study notes (efficient large-model training). ☆94
- The Roadmap for LLMs. ☆84
- The pure and clear PyTorch Distributed Training Framework. ☆276
- Lion and Adam optimization comparison. ☆56
- ☆170
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691). ☆111
- A list of papers about data quality in Large Language Models (LLMs). ☆18
- FLASHQuad_pytorch. ☆66
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales. ☆30
- A generalized framework for subspace tuning methods in parameter-efficient fine-tuning. ☆70
- Rectified Rotary Position Embeddings. ☆329
- ☆139
- Easier Configuration. ☆30
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models (https://arxiv.org/abs/2204.00408). ☆188
- A paper list about diffusion models for natural language processing. ☆170
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment. ☆191
- RoFormer V1 & V2 in PyTorch. ☆462
- NTK-scaled version of the ALiBi position encoding in the Transformer. ☆64
- Paper List for In-context Learning 🌷 ☆165
- [ICLR 2022] Official implementation of cosformer-attention in cosFormer: Rethinking Softmax in Attention. ☆178
- 😎 A simple and easy-to-use toolkit for GPU scheduling. ☆40