mli / transformers-benchmarks
real Transformer TeraFLOPS on various GPUs
☆898 · Updated last year
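For context on what "real Transformer TeraFLOPS" means here, the sketch below is a minimal, self-contained example of the kind of measurement such a benchmark performs: time a large half-precision matmul and convert the elapsed time into achieved TFLOPS. It is my own illustration, not code from the repository (the function name `matmul_tflops` is assumed), and it assumes a CUDA-capable GPU is available.

```python
# Minimal sketch: measure achieved matmul TFLOPS on the local GPU.
# Not taken from transformers-benchmarks; for illustration only.
import time
import torch

def matmul_tflops(n=4096, iters=10, dtype=torch.float16):
    a = torch.randn(n, n, dtype=dtype, device="cuda")
    b = torch.randn(n, n, dtype=dtype, device="cuda")
    torch.cuda.synchronize()                 # make sure setup work is done
    start = time.time()
    for _ in range(iters):
        a @ b                                # the kernel being timed
    torch.cuda.synchronize()                 # wait for all matmuls to finish
    elapsed = time.time() - start
    # One n x n matmul costs roughly 2 * n^3 floating-point operations.
    return 2 * n**3 * iters / elapsed / 1e12

if __name__ == "__main__":
    print(f"~{matmul_tflops():.1f} TFLOPS")
```

Real Transformer layers mix matmuls with memory-bound ops (layer norm, softmax, attention masking), so end-to-end achieved TFLOPS is typically well below this matmul-only number.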
Alternatives and similar repositories for transformers-benchmarks:
Users interested in transformers-benchmarks are comparing it to the libraries listed below:
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training ☆403 · Updated last week
- Rotary Transformer ☆941 · Updated 3 years ago
- How to use wandb? ☆639 · Updated last year
- Several simple examples of calling custom CUDA operators from popular neural network toolkits. ☆1,465 · Updated 4 years ago
- An implementation of transformer, BERT, GPT, and diffusion models for learning purposes ☆154 · Updated 6 months ago
- A quickstart and benchmark for PyTorch distributed training (a minimal DDP sketch follows this list) ☆1,661 · Updated 9 months ago
- Best practice for training LLaMA models in Megatron-LM ☆650 · Updated last year
- A write-up of single-machine multi-GPU training methods in PyTorch and how they work ☆818 · Updated 3 years ago
- A fast MoE implementation for PyTorch ☆1,713 · Updated 2 months ago
- PyTorch distributed training tutorials ☆127 · Updated 2 weeks ago
- ☆608 · Updated 11 months ago
- Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch ☆345 · Updated last week
- An easy-to-adapt PyTorch Lightning template: a thin wrapper that is simple to use, so existing PyTorch code can be adapted to Lightning with only minor changes. You can translate your previous Pytorch code much… ☆1,448 · Updated last year
- RoFormer V1 & V2 PyTorch ☆496 · Updated 2 years ago
- Efficient Training (including pre-training and fine-tuning) for Big Models ☆587 · Updated 2 weeks ago
- Hugging Face mirror download ☆577 · Updated last month
- SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants. ☆1,077 · Updated 4 months ago
- A purer tokenizer with a higher compression ratio ☆480 · Updated 5 months ago
- Cool Papers - Immersive Paper Discovery ☆529 · Updated last month
- PyTorch Project Specification. ☆679 · Updated 3 years ago
- A plug-and-play library for parameter-efficient tuning (Delta Tuning) ☆1,027 · Updated 7 months ago
- PyTorch ❤️ Keras 😋😋 ☆1,920 · Updated last month
- ☆161 · Updated last month
- Efficient Inference for Big Models ☆583 · Updated 2 years ago
- Learn large models through diagrams ☆296 · Updated 9 months ago
- A convenient script for grabbing (preempting) GPUs ☆329 · Updated 3 months ago
- Tutel MoE: an optimized Mixture-of-Experts library; supports DeepSeek FP8/FP4 ☆818 · Updated this week
- ☆256 · Updated 2 months ago
- Train a 1B LLM on 1T tokens from scratch as a personal project ☆633 · Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆1,386 · Updated last year
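Several entries above (the PyTorch distributed-training quickstart and the single-machine multi-GPU write-up) cover data-parallel training. As a reference point only, here is a minimal DDP sketch of the general pattern; it is not taken from any listed repository, and it uses the `gloo` backend so it also runs on CPU-only machines (switch to `nccl` and pass `device_ids` when each rank owns a GPU).

```python
# Minimal sketch of single-machine, multi-process data-parallel training
# with PyTorch DistributedDataParallel. Illustration only, not from any listed repo.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    # "gloo" works on CPU-only machines; use "nccl" when each rank owns a GPU.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(16, 4))      # gradients are all-reduced across ranks
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    for _ in range(3):
        x, y = torch.randn(8, 16), torch.randn(8, 4)
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()                      # DDP synchronizes gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)    # two processes on one machine
```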