dourgey / qwen2_moe_mergekit

A tool for generating Qwen2 MoE models from Qwen2 (Qwen1.5) models

☆16

Alternatives and similar repositories for qwen2_moe_mergekit

Users that are interested in qwen2_moe_mergekit are comparing it to the libraries listed below

hengjiUSTC / learn-llm
☆108 Updated 6 months ago
jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…
☆56 Updated last year
yuanzhoulvpi2017 / SentenceEmbedding
☆110 Updated 11 months ago
zexuanqiu / CLongEval
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
☆40 Updated last year
sugarandgugu / Simple-Trl-Training
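Fine-tuning large language models with the DPO algorithm; simple and easy to get started.
☆39 Updated 11 months ago
yongzhuo / qwen2-sft
Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat
☆62 Updated last year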
beichao1314 / Open-Llama
The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
☆65 Updated 2 years ago
RUC-GSAI / Llama-3-SynE
Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …
☆32 Updated last week
taishan1994 / Llama3.1-Finetuning
Full-parameter, LoRA, and QLoRA fine-tuning of llama3.
☆198 Updated 8 months ago
IronBeliever / CaR
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
☆79 Updated 6 months ago
yanqiangmiffy / how-to-train-tokenizer
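How to train an LLM tokenizer
☆149 Updated last year
wjn1996 / ChatGLM2-Tuning
Fine-tuning based on ChatGLM2-6B, including full-parameter, parameter-efficient, and quantization-aware training; supports instruction fine-tuning, multi-turn dialogue fine-tuning, and more.
☆25 Updated last year
ArtificialZeng / llama3_explained
The newest version of llama3, with source code explained line by line in Chinese
☆22 Updated last year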
TemporaryLoRA / Temp-LoRA
☆105 Updated last year
RUCAIBox / EASYEP
☆18 Updated last month
akaihaoshuai / baby-llama2-chinese_cybertron
Training an LLM from scratch on a single 24 GB GPU
☆54 Updated 2 weeks ago
Alibaba-NLP / LaRA
The code for LaRA Benchmark
☆35 Updated last week
taishan1994 / sentencepiece_chinese_bpe
Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.
☆118 Updated last year
multimodal-art-projection / Megatron-LM-NEO
☆40 Updated last year
owenliang / qwen-dpo
DPO training for Tongyi Qianwen (Qwen)
☆48 Updated 8 months ago
ssbuild / llm_finetuning
Large language model fine-tuning for bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on
☆97 Updated last year
hhnqqq / GemmaLongText
☆16 Updated last year
cjymz886 / LLM-RAG-QA
LLM+RAG for QA
☆22 Updated last year
suu990901 / LLaMA-MiLe-Loss
Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
☆62 Updated 3 months ago
zysNLP / quickllm
A repo for updating and debugging Mixtral-7x8B, MOE, ChatGLM3, LLaMa2, BaChuan, Qwen, and other LLM models, including new models mixtral, mixtral 8x7b, …
☆46 Updated last week
ssbuild / qwen_finetuning
Qwen model fine-tuning
☆98 Updated 2 months ago
SkyworkAI / skywork-o1-prm-inference
☆63 Updated 6 months ago
CASIA-LM / MoDS
☆141 Updated last year
Academic-Hammer / HammerLLM
1.4B sLLM for Chinese and English - HammerLLM🔨
☆44 Updated last year
linjh1118 / Llama3-Chinese-ORPO
A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO
☆17 Updated last year