THUDM / SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library for developing your own Transformer variants.
☆1,068 · Updated 2 months ago
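The library's pitch is composing your own Transformer variants by swapping custom components into a base model. As a rough illustration of that pattern in plain PyTorch (the names `BaseBlock`, `GatedAttention`, and `swap_attention` are hypothetical and not SwissArmyTransformer's actual API), a minimal sketch:

```python
import torch
import torch.nn as nn

class BaseBlock(nn.Module):
    """A vanilla pre-norm Transformer block."""
    def __init__(self, dim: int, heads: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        return x + self.mlp(self.norm2(x))

class GatedAttention(nn.Module):
    """A toy 'variant': standard attention with a learned output gate."""
    def __init__(self, dim: int, heads: int):
        super().__init__()
        self.inner = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(dim, dim)

    def forward(self, q, k, v, need_weights=False):
        out, _ = self.inner(q, k, v, need_weights=False)
        return out * torch.sigmoid(self.gate(q)), None  # same (output, weights) shape as nn.MultiheadAttention

def swap_attention(block: BaseBlock, dim: int, heads: int) -> BaseBlock:
    """Turn a base block into a variant by replacing one submodule."""
    block.attn = GatedAttention(dim, heads)
    return block

block = swap_attention(BaseBlock(64, 4), 64, 4)
y = block(torch.randn(2, 16, 64))   # (batch, seq, dim)
print(y.shape)                      # torch.Size([2, 16, 64])
```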
Alternatives and similar repositories for SwissArmyTransformer:
Users interested in SwissArmyTransformer are comparing it to the libraries listed below.
- Open Academic Research on Improving LLaMA to SOTA LLM ☆1,617 · Updated last year
- Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens" ☆864 · Updated 3 months ago
- LOMO: LOw-Memory Optimization ☆981 · Updated 8 months ago
- Emu Series: Generative Multimodal Models from BAAI ☆1,695 · Updated 5 months ago
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion" ☆1,405 · Updated last year
- Real Transformer TeraFLOPS on various GPUs ☆898 · Updated last year
- A plug-and-play library for parameter-efficient tuning (Delta Tuning) ☆1,019 · Updated 6 months ago
- [NIPS2023] RRHF & Wombat ☆804 · Updated last year
- Collaborative Training of Large Language Models in an Efficient Way ☆413 · Updated 6 months ago
- Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo ☆1,068 · Updated 7 months ago
- We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) ☆2,713 · Updated last year
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024) ☆936 · Updated 3 months ago
- An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks ☆2,020 · Updated last year
- Next-Token Prediction is All You Need ☆2,042 · Updated last week
- 🩹 Editing large language models within 10 seconds ⚡ ☆1,317 · Updated last year
- Rotary Transformer (RoPE; a sketch of the rotary embedding appears after this list) ☆922 · Updated 3 years ago
- Efficient Training (including pre-training and fine-tuning) for Big Models ☆580 · Updated 8 months ago
- A fast MoE implementation for PyTorch ☆1,682 · Updated last month
- Best practice for training LLaMA models in Megatron-LM ☆645 · Updated last year
- Hugging Face mirror download ☆567 · Updated last week
- A purer tokenizer with a higher compression ratio ☆470 · Updated 3 months ago
- [COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition (a minimal LoRA sketch appears after this list) ☆619 · Updated 8 months ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,378 · Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆2,029 · Updated this week
- Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch ☆638 · Updated 2 months ago
- Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net from DeepMind, in PyTorch ☆1,235 · Updated 2 years ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,218 · Updated 2 weeks ago
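The "Rotary Transformer" entry above refers to rotary position embeddings (RoPE), which encode position by rotating each (even, odd) channel pair of the queries and keys by a position-dependent angle, so attention scores depend only on relative offsets. A minimal sketch in plain PyTorch (the helper name `rope` is mine, not that repository's API):

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (..., seq, dim).
    Channel pair (2i, 2i+1) at position pos is rotated by pos * base**(-2i/dim)."""
    seq, dim = x.shape[-2], x.shape[-1]
    assert dim % 2 == 0
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = torch.arange(seq, dtype=torch.float32)[:, None] * inv_freq  # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Relative-position property: rotations compose, so R(m)q · R(n)k = q · R(n-m)k,
# and the attention score depends only on the offset n - m.
q = rope(torch.randn(1, 8, 64))
k = rope(torch.randn(1, 8, 64))
scores = q @ k.transpose(-2, -1)   # would feed softmax attention
```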
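Several entries above (the unified parameter-efficient-tuning interface, LoraHub) build on LoRA, which freezes a pretrained weight W and trains a low-rank update BA, giving an effective weight W + (α/r)·BA. A minimal sketch of a single LoRA-wrapped linear layer (the class `LoRALinear` and its arguments are illustrative, not LoraHub's API):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = base(x) + (alpha / r) * (x @ A.T) @ B.T"""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():        # freeze pretrained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # B = 0 makes the update a no-op at init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * ((x @ self.A.t()) @ self.B.t())

layer = LoRALinear(nn.Linear(128, 128))
out = layer(torch.randn(4, 128))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(out.shape, trainable)   # torch.Size([4, 128]) 2048 trainable params vs 16,512 in the base layer
```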