cooper12121 / llama3-Chinese
Pre-training llama3 on Chinese
☆14Updated 11 months ago
Alternatives and similar repositories for llama3-Chinese:
Users that are interested in llama3-Chinese are comparing it to the libraries listed below
- A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO☆17Updated last year
- The newest version of llama3, with source code explained line by line in Chinese☆22Updated last year
- XVERSE-7B: A multilingual large language model developed by XVERSE Technology Inc.☆52Updated last year
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.☆37Updated 4 months ago
- ☆26Updated 6 months ago
- GPT+ tool: a simple, practical one-stop AGI architecture with built-in localization, LLM models, agents, a vector database, and smart chains☆48Updated last year
- It's an open-source LLM based on an MoE structure.☆58Updated 9 months ago
- Copy the MLP of llama3 8 times as 8 experts, create a router with random initialization, add load balancing loss to construct an 8x8b Mo…☆26Updated 9 months ago
- Qwen1.5-SFT (Alibaba), Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat fine-tuning (transformers)/LoRA (peft)/inference☆57Updated 11 months ago
- ☆17Updated 10 months ago
- This repository provides an implementation of the paper "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Co…☆67Updated last month
- Automatically tracks hot GitHub repos, updated daily☆48Updated this week
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction".☆65Updated 9 months ago
- To try TextIn document parsing, click https://cc.co/16YSIy☆22Updated 9 months ago
- (Work in progress) This repository is tutorial-oriented: it uses "adapting a model to Chinese" as a typical model-training scenario to guide readers through hands-on secondary fine-tuning of LLMs.☆33Updated 8 months ago
- ☆37Updated last week
- Qwen-Efficient-Tuning☆43Updated last year
- ChatGLM2-6B fine-tuning, SFT/LoRA, instruction finetune☆107Updated last year
- ☆11Updated 8 months ago
- GRAIN: Gradient-based Intra-attention Pruning on Pre-trained Language Models☆19Updated last year
- flow mirror models from JZX AI Labs☆45Updated 6 months ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Updated last year
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆139Updated last year
- ☆140Updated 11 months ago
- We are the first fully commercially usable role-play large model.☆39Updated 8 months ago
- ☆91Updated last year
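- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles.☆83Updated 7 months ago
- A repo for updating and debugging Mixtral-7x8B, MOE, ChatGLM3, LLaMa2, BaChuan, Qwen and other LLM models, including new models mixtral, mixtral 8x7b, …☆43Updated last month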
- ☆36Updated 6 months ago
- qwen2 and llama3 cpp implementation☆44Updated 10 months ago