lansinuote / Simple_TRL
☆18Updated 10 months ago
Alternatives and similar repositories for Simple_TRL
Users that are interested in Simple_TRL are comparing it to the libraries listed below
- Fine-tuning large language models with the DPO algorithm; simple and easy to get started.☆39Updated 11 months ago
- DPO training for Tongyi Qianwen (Qwen)☆48Updated 9 months ago
- ☆85Updated last week
- Training an LLM from scratch on a single 24 GB GPU☆55Updated last month
- ☆111Updated 11 months ago
- ☆69Updated last year
- ☆82Updated 8 months ago
- Custom reward development on top of verl☆54Updated last month
- Train your GRPO with zero dataset and low resources, 8bit/4bit/LoRA/QLoRA supported, multi-GPU supported ...☆73Updated last month
- Inference code for LLaMA models☆121Updated last year
- personal chatgpt☆373Updated 6 months ago
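- Adds an RLHF implementation to ChatGLM-6B, with line-by-line walkthroughs of some of the core code; the examples cover short news headline generation and an RLHF implementation for recommendation given a specified context☆86Updated last year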
- ☆43Updated 10 months ago
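- Code for AUTOPLAN, published in Acta Automatica Sinica; uses large language models for task planning and task execution on complex tasks☆101Updated 7 months ago
- Alibaba Tongyi Qianwen (Qwen-7B-Chat/Qwen-7B): fine-tuning, LoRA, inference☆105Updated last year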
- ☆83Updated 4 months ago
- llm & rl☆151Updated this week
- ☆72Updated last month
- Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat☆114Updated 2 years ago
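- Fine-tuning for LLaMA, ChatGLM, and other models☆89Updated 11 months ago
- First-place solution for the Tianchi algorithm competition "BetterMixture - LLM Data Mixing Challenge"☆31Updated 11 months ago
- How to train an LLM tokenizer☆150Updated last year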
- stay tuned.☆16Updated last week
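- Dataset synthesis, model training, and evaluation for LLM mathematical problem solving, with notes on related articles.☆88Updated 9 months ago
- Interview questions and interview experiences from major tech companies, for programmers☆137Updated last month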
- ☆141Updated last year
- Quick start for RAG and private deployment☆191Updated last year
- ☆44Updated 4 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"☆81Updated 9 months ago
- Reinforcement Learning in LLM and NLP.☆39Updated this week