ZhuiyiTechnology / GAU-alpha

A Transformer model based on the Gated Attention Unit (preview version)

☆97

Alternatives and similar repositories for GAU-alpha

Users who are interested in GAU-alpha are comparing it to the repositories listed below.

JunnYu / FLASHQuad_pytorch
FLASHQuad_pytorch
☆67 · Updated 3 years ago
ZhuiyiTechnology / roformer-v2
Upgraded version of RoFormer
☆153 · Updated 2 years ago
nengwp / Lion-vs-Adam
Lion and Adam optimization comparison
☆62 · Updated 2 years ago
JunnYu / RoFormer_pytorch
RoFormer V1 & V2 pytorch
☆506 · Updated 3 years ago
bojone / tiger
A Tight-fisted Optimizer
☆48 · Updated 2 years ago
keezen / ntk_alibi
NTK scaled version of ALiBi position encoding in Transformer.
☆69 · Updated last year
DRSY / EMO
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
☆123 · Updated last year
Hzfinfdu / PLMTuningCompetition
Arena Contest 3: example code and baseline implementation for the large-scale pre-trained model tuning competition
☆38 · Updated 2 years ago
bojone / r-drop
Simple experiments with the R-Drop method on Chinese tasks
☆91 · Updated 3 years ago
1140310118 / tdlm
Implementations of several position encoding schemes used in Transformer
☆44 · Updated 3 years ago
princeton-nlp / CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
☆196 · Updated 2 years ago
bojone / LST-CLUE
A simple trial of Ladder Side-Tuning on CLUE
☆21 · Updated 3 years ago
TobiasLee / Awesome-Efficient-PLM
Must-read papers on improving efficiency for pre-trained language models.
☆104 · Updated 2 years ago
CLUEbenchmark / SuperCLUE-Math6
SuperCLUE-Math6: exploring a new generation of native Chinese multi-turn, multi-step mathematical reasoning datasets
☆59 · Updated last year
TsinghuaAI / CUGE
☆53 · Updated 3 years ago
TsinghuaAI / CPM-2-Finetune
Finetune CPM-2
☆82 · Updated 2 years ago
YJiangcm / Lion
[EMNLP 2023] Lion: Adversarial Distillation of Proprietary Large Language Models
☆210 · Updated last year
TsinghuaAI / CPM-1-Pretrain
Pretrain CPM-1
☆53 · Updated 4 years ago
wxl1999 / PLMPapers
A paper list of pre-trained language models (PLMs).
☆81 · Updated 3 years ago
THUDM / icetk
A unified tokenization tool for Images, Chinese and English.
☆151 · Updated 2 years ago
TsinghuaAI / CPM
Introduction to CPM
☆165 · Updated 3 years ago
hazdzz / tiger
A Tight-fisted Optimizer (Tiger), implemented in PyTorch.
☆12 · Updated last year
bojone / univae
A single-model, multi-scale VAE based on Transformer
☆57 · Updated 4 years ago
jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…
☆58 · Updated last year
bojone / shuffle
Shuffling files of several hundred GB in Python
☆33 · Updated 3 years ago
Langboat / mengzi-zero-shot
Zero-shot NLU & NLG based on the mengzi-t5-base-mt pretrained model
☆74 · Updated 2 years ago
THUDM / iPrompt
Code, Data and Demo for Paper: Controllable Generation from Pre-trained Language Models via Inverse Prompting
☆122 · Updated 2 years ago
Spico197 / watchmen
😎 A simple and easy-to-use toolkit for GPU scheduling.
☆45 · Updated 2 months ago
JunnYu / GAU-alpha-pytorch
GAU-alpha-pytorch
☆19 · Updated 3 years ago
RUCKBReasoning / GLM-Dialog
☆59 · Updated 2 years ago