mli / transformers-benchmarks
real Transformer TeraFLOPS on various GPUs
☆898 · Updated last year
Alternatives and similar repositories for transformers-benchmarks:
Users interested in transformers-benchmarks are comparing it to the libraries listed below.
- Several simple examples for popular neural network toolkits calling custom CUDA operators. ☆1,417 · Updated 3 years ago
- How to use wandb? ☆625 · Updated last year
- A summary of single-machine multi-GPU training methods and principles in PyTorch. ☆803 · Updated 3 years ago
- LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training. ☆397 · Updated 2 months ago
- An easy/swift-to-adapt PyTorch Lightning template. A wrapper template that is simple and easy to use: adapt your existing PyTorch code to Lightning with only minor changes. You can translate your previous Pytorch code much… ☆1,423 · Updated last year
- SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants. ☆1,068 · Updated 2 months ago
- Best practice for training LLaMA models in Megatron-LM. ☆645 · Updated last year
- ☆604 · Updated 9 months ago
- An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes. ☆151 · Updated 5 months ago
- This repository mainly records interview questions for large language model (LLM) algorithm engineers. ☆1,829 · Updated 2 months ago
- Pytorch❤️ Keras 😋😋 ☆1,891 · Updated this week
- The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud. ☆946 · Updated this week
- Efficient Training (including pre-training and fine-tuning) for Big Models. ☆580 · Updated 8 months ago
- A purer tokenizer with a higher compression ratio. ☆470 · Updated 3 months ago
- Hugging Face mirror download. ☆567 · Updated last week
- PyTorch distributed training tutorials. ☆117 · Updated last month
- A quickstart and benchmark for PyTorch distributed training. ☆1,657 · Updated 7 months ago
- LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers. ☆1,222 · Updated last year
- A fast MoE implementation for PyTorch. ☆1,675 · Updated last month
- ☆254 · Updated 2 weeks ago
- Rotary Transformer. ☆916 · Updated 3 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2. ☆2,022 · Updated 3 weeks ago
- Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo. ☆1,066 · Updated 7 months ago
- The pure and clear PyTorch Distributed Training Framework. ☆276 · Updated last year
- A collection of phenomena observed during the scaling of big foundation models, which may be developed into consensus, principles, or l… ☆277 · Updated last year
- Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning. ☆989 · Updated 10 months ago
- LLMs interview notes and answers: this repository mainly records interview questions and reference answers for large language model (LLM) algorithm engineers. ☆489 · Updated last year
- Cool Papers - Immersive Paper Discovery. ☆498 · Updated this week
- personal chatgpt. ☆354 · Updated 3 months ago
- Collaborative Training of Large Language Models in an Efficient Way. ☆413 · Updated 6 months ago