OpenBMB / BMTrain
Efficient Training (including pre-training and fine-tuning) for Big Models
☆575 · Updated 6 months ago
Alternatives and similar repositories for BMTrain:
Users interested in BMTrain are comparing it to the libraries listed below.
- Efficient, Low-Resource, Distributed transformer implementation based on BMTrain ☆246 · Updated last year
- Model Compression for Big Models ☆157 · Updated last year
- Best practice for training LLaMA models in Megatron-LM ☆644 · Updated last year
- Efficient Inference for Big Models ☆578 · Updated 2 years ago
- Collaborative Training of Large Language Models in an Efficient Way ☆411 · Updated 5 months ago
- ☆456 · Updated 8 months ago
- [NeurIPS 2023] RRHF & Wombat ☆799 · Updated last year
- LongBench v2 and LongBench (ACL 2024) ☆780 · Updated last month
- Train LLaMA on a single A100 80G node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism ☆215 · Updated last year
- ☆318 · Updated 7 months ago
- A plug-and-play library for parameter-efficient tuning (Delta Tuning) ☆1,013 · Updated 5 months ago
- The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud ☆851 · Updated last week
- ☆304 · Updated last year
- ☆278 · Updated 9 months ago
- Live Training for Open-source Big Models ☆508 · Updated last year
- Implementation of Chinese ChatGPT ☆287 · Updated last year
- A flexible and efficient training framework for large-scale alignment tasks ☆303 · Updated last week
- Naive Bayes-based Context Extension ☆320 · Updated 2 months ago
- A purer tokenizer with a higher compression ratio ☆471 · Updated 2 months ago
- A multi-dimensional Chinese alignment evaluation benchmark for large models (ACL 2024) ☆359 · Updated 6 months ago
- ☆728 · Updated 8 months ago
- ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024) ☆922 · Updated 2 months ago
- ☆152 · Updated this week
- Easy and efficient fine-tuning of LLMs (supports LLaMA, LLaMA 2, LLaMA 3, Qwen, Baichuan, GLM, Falcon); efficient quantized training and deployment of large models ☆592 · Updated 3 weeks ago
- FlagEval is an evaluation toolkit for large AI foundation models ☆319 · Updated 7 months ago
- ☆903 · Updated 8 months ago
- A List of Big Models ☆341 · Updated last year
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR 2024] ☆534 · Updated 2 months ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆307 · Updated last year
- [ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long-context language model evaluation benchmark ☆369 · Updated 7 months ago