☆140Sep 29, 2024Updated last year
Alternatives and similar repositories for trian_ppo
Users that are interested in trian_ppo are comparing it to the libraries listed below
- ☆131Aug 8, 2024Updated last year
- coded with and corrected by Google Anti-Gravity☆13Nov 23, 2025Updated 3 months ago
- ☆94Nov 5, 2024Updated last year
- DPO training for Tongyi Qianwen (Qwen)☆63Sep 21, 2024Updated last year
- ☆33Jul 8, 2025Updated 8 months ago
- StrongSORT with Selective Feature Extraction Mechanism☆15Sep 25, 2024Updated last year
- Converted the training data of OpenVLA into general form of multimodal training instructions and then used with LLaVA-OneVision☆23Jan 12, 2025Updated last year
- Train a deepseek r1-like reasoning LLM with ease☆18Feb 15, 2025Updated last year
- Reproductions of large-model algorithms, plus some study notes☆3,023Feb 10, 2026Updated 3 weeks ago
- Long CoT Fine-Tuning and Reinforcement Learning for LLMs in the Context of the 24-Point Game: A Toy Project☆25Feb 22, 2025Updated last year
- Describes how to run DBFace, a real-time, single-shot face detection model on Intel OpenVINO☆29Aug 23, 2020Updated 5 years ago
- Official PyTorch implementation of "BroadFace: Looking at Tens of Thousands of People at Once for Face Recognition", ECCV 2020☆24Jun 14, 2021Updated 4 years ago
- ☆53Updated this week
- transformer source code implementation☆27Dec 17, 2024Updated last year
- Library for training process reward models☆29Jun 3, 2025Updated 9 months ago
- Quick Notebook Tutorials☆36Jul 17, 2025Updated 7 months ago
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library☆50Aug 20, 2025Updated 6 months ago
- Agent CLI☆14Updated this week
- ☆12Aug 2, 2024Updated last year
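- A reproduction of open-r1 that applies GRPO training to 0.5B, 1.5B, 3B, and 7B qwen models and observes some interesting phenomena.☆56Apr 13, 2025Updated 10 months ago
- A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization; the model has 1B parameters and supports Chinese and English.☆773Feb 18, 2025Updated last year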
- ☆86Jul 24, 2025Updated 7 months ago
- RLCar Gazebo v2☆12Jun 28, 2024Updated last year
- Network study notes; synced board: https://github.com/orgs/apachecn/teams/diaosi☆11Jan 7, 2020Updated 6 years ago
- ☆14Nov 6, 2025Updated 4 months ago
- Operating System for your EON Gold☆13Dec 19, 2018Updated 7 years ago
- Implementation of lanenet model for real-time lane detection using a deep neural network☆11Aug 13, 2018Updated 7 years ago
- Simple repo to finetune an LLM hosted on Hugging Face by creating a LORA☆11Dec 20, 2023Updated 2 years ago
- Package: Interactive Presentation Ninja☆10Jun 7, 2024Updated last year
- something for paper agent☆11Dec 18, 2024Updated last year
- CapsNet implementation in keras for R☆12May 8, 2018Updated 7 years ago
- yolo object detection algorithm☆15Jul 27, 2025Updated 7 months ago
- The official github repo for the open online courses: "Dive into LLMs".☆10Mar 15, 2024Updated last year
- Python3 script to create Voronoi tessellations (mosaic pattern) on images☆10May 25, 2019Updated 6 years ago
- New version of mpMap☆12Jul 19, 2020Updated 5 years ago
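- Scrapes Baidu Index data☆12Dec 8, 2022Updated 3 years ago
- simple decoder-only GPT model in pytorch☆43May 19, 2024Updated last year
- This project is a Chinese translation and knowledge-organization tutorial built around the Post-Training for LLMs course series from DeepLearning.AI, tailored for learners in China. It provides course content translations, key-point summaries, and example code, aiming to lower the language barrier so that more students, researchers, and developers can systematically master large language models…☆154Jan 4, 2026Updated 2 months ago
- Implements several positional encoding schemes used in Transformer☆44Oct 6, 2021Updated 4 years ago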