LydiaXiaohongLi / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

☆19

Alternatives and similar repositories for Megatron-DeepSpeed

Users who are interested in Megatron-DeepSpeed are comparing it to the repositories listed below.

keezen / ntk_alibi
NTK scaled version of ALiBi position encoding in Transformer.
☆69 Updated 2 years ago
seanzhang-zhichen / baichuan-Dynamic-NTK-ALiBi
Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning
☆49 Updated 2 years ago
CLUEbenchmark / SuperCLUE-Math6
SuperCLUE-Math6: exploring a new-generation native Chinese dataset for multi-turn, multi-step mathematical reasoning
☆60 Updated last year
ProjectD-AI / LLaMA-Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆68 Updated 2 years ago
OpenLMLab / scaling-rope
code for Scaling Laws of RoPE-based Extrapolation
☆73 Updated 2 years ago
beichao1314 / Open-Llama
The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
☆67 Updated 2 years ago
yanqiangmiffy / how-to-train-tokenizer
How to train an LLM tokenizer
☆153 Updated 2 years ago
genggui001 / Megatron-DeepSpeed-Llama
☆84 Updated 2 years ago
jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…
☆58 Updated last year
llmeval / LLMEval-1
Chinese large language model evaluation, phase 1
☆110 Updated 2 years ago
FlagOpen / FlagInstruct
☆172 Updated 2 years ago
xv44586 / Chinese-instruction-datasets
Chinese instruction tuning datasets
☆137 Updated last year
Oneflow-Inc / one-glm
A more efficient GLM implementation!
☆54 Updated 2 years ago
zejunwang1 / LLMTuner
Instruction tuning tool for large language models (supports FlashAttention)
☆178 Updated last year
vxfla / kanchil
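The kanchil (mouse-deer) is the world's smallest even-toed ungulate; this open-source project explores whether small models (under 6B) can also be aligned with human preferences.
☆113 Updated 2 years ago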
Longyichen / Alpaca-family-library
A summary of all open-source large language models and low-cost replication methods for ChatGPT.
☆137 Updated 2 years ago
Felixgithub2017 / MMCU
MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING
☆89 Updated last year
aplmikex / deduplication_mnbvc
Text deduplication
☆76 Updated last year
THUDM / icetk
A unified tokenization tool for Images, Chinese and English.
☆151 Updated 2 years ago
FudanNLPLAB / CBook-150K
MD5 links for a Chinese book corpus
☆217 Updated last year
OpenLMLab / ChatZoo
Light local website for displaying performances from different chat models.
☆87 Updated last year
CLUEbenchmark / ZeroCLUE
Zero-shot learning evaluation benchmark, Chinese version
☆57 Updated 4 years ago
CSHaitao / ChatGLM_mutli_gpu_tuning
Simple, efficient multi-GPU fine-tuning of large models with deepspeed + trainer
☆129 Updated 2 years ago
pleisto / yuren-baichuan-7b
Open-source multimodal large language model based on baichuan-7b
☆72 Updated last year
zhoucz97 / awesome-ChatGPT
A collection of ChatGPT-related resources
☆56 Updated 2 years ago
Langboat / mengzi-zero-shot
Zero-shot NLU & NLG based on the mengzi-t5-base-mt pretrained model
☆76 Updated 3 years ago
thu-coai / OPD
OPD: Chinese Open-Domain Pre-trained Dialogue Model
☆75 Updated 2 years ago
sufengniu / RefGPT
☆163 Updated 2 years ago
mutonix / RefGPT
☆98 Updated last year
zhangzhao219 / WSDM-Cup-2024
1st Solution For Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc
☆161 Updated 3 months ago