OpenRL-Lab / Ray_Tutorial
Tutorial for Ray
☆29Updated last year
Alternatives and similar repositories for Ray_Tutorial
Users that are interested in Ray_Tutorial are comparing it to the libraries listed below
- A MoE impl for PyTorch, [ATC'23] SmartMoE☆70Updated 2 years ago
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm…☆68Updated last month
- 青稞Talk☆148Updated last week
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems☆217Updated this week
- ☆77Updated last month
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆162Updated last month
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆137Updated last year
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆142Updated 5 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆256Updated 7 months ago
- Implementation of FP8/INT8 rollout for RL training without performance drop.☆242Updated this week
- DeepSeek Native Sparse Attention pytorch implementation☆97Updated last month
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library☆45Updated last month
- ☆33Updated 6 months ago
- RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek R1-Zero reproduction.☆73Updated 7 months ago
- qwen-nsa☆75Updated 5 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning☆190Updated 6 months ago
- ☆71Updated 4 months ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory☆188Updated 2 months ago
- Efficient Mixture of Experts for LLM Paper List☆131Updated this week
- Scaling Preference Data Curation via Human-AI Synergy☆114Updated 3 months ago
- HFAI deep learning models☆152Updated 2 years ago
- A beginner's introductory tutorial on model compression☆22Updated last year
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆91Updated 6 months ago
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository☆70Updated this week
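- DeepSpeed tutorials, annotated examples, and study notes (efficient training of large models)☆178Updated 2 years ago
- Train a LLaVA model with better Chinese support, with open-sourced training code and data.☆72Updated last year
- Analyse problems of AI with math and code☆25Updated 2 months ago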
- Tiny-Megatron, a minimalistic re-implementation of the Megatron library☆17Updated last month
- A High-Efficiency System of Large Language Model Based Search Agents☆74Updated 3 months ago
- ☆202Updated 5 months ago