RethinkFun / trian_ppo
☆37Updated 4 months ago
Alternatives and similar repositories for trian_ppo:
Users that are interested in trian_ppo are comparing it to the libraries listed below
- ☆11Updated 3 months ago
- ☆67Updated 3 months ago
- ☆48Updated 6 months ago
- LLM Tokenizer with BPE algorithm☆29Updated 9 months ago
- DPO training for Tongyi Qianwen (Qwen)☆33Updated 5 months ago
- Train a LLaVA model with better Chinese support, with open-source training code and data.☆45Updated 5 months ago
- ☆102Updated 7 months ago
- ☆17Updated 6 months ago
- Interview questions and interview experience for programmers applying to major tech companies☆116Updated last month
- ☆104Updated 3 months ago
- Train an LLM from scratch on a single 24 GB GPU☆50Updated 3 months ago
- DeepSpeed tutorials, annotated examples, and study notes (efficient large-model training)☆150Updated last year
- ☆29Updated 5 months ago
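- Notes on multimodal topics for large language model (LLM) algorithm (application) engineers☆121Updated 9 months ago
- Hand-written interview questions (not LeetCode) for LLMs (the main focus) and other AI algorithm areas such as search, advertising, and recommendation, e.g. Self-Attention and AUC; these generally test a candidate's overall ability more than LeetCode does and sit closer to business practice and fundamentals☆134Updated last month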
- Inference code for LLaMA models☆113Updated last year
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆148Updated this week
- PyTorch distributed training☆63Updated last year
- Simple decoder-only GPT model in PyTorch☆35Updated 9 months ago
- ☆39Updated 6 months ago
- ☆62Updated last month
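- This project covers dataset synthesis, model training, and evaluation for the math problem-solving ability of large models, along with notes on related articles.☆73Updated 5 months ago
- LLM applications: RAG, NL2SQL, chatbots, pre-training, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, Tianchi data competitions☆56Updated last week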
- personal chatgpt☆337Updated 2 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆106Updated last month
- ☆58Updated 11 months ago
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆49Updated last month