OpenRL-Lab / Ray_Tutorial
Tutorial for Ray
☆17 Updated 9 months ago
Alternatives and similar repositories for Ray_Tutorial:
Users who are interested in Ray_Tutorial are comparing it to the libraries listed below
- A visualization tool for deeper understanding and easier debugging of RLHF training. ☆106 Updated last week
- ☆50 Updated last month
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆64 Updated 3 weeks ago
- SQUEEZED ATTENTION: Accelerating Long Prompt LLM Inference ☆36 Updated 2 months ago
- AI Alignment: A Comprehensive Survey ☆133 Updated last year
- Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint? ☆89 Updated 2 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆128 Updated 7 months ago
- ☆62 Updated 3 months ago
- ☆48 Updated last year
- A brief and partial summary of RLHF algorithms. ☆89 Updated last month
- Natural Language Reinforcement Learning ☆68 Updated last month
- A Really Scalable RL Framework to 10k+ CPUs ☆22 Updated 10 months ago
- A Telegram bot to recommend arXiv papers ☆221 Updated last week
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents ☆59 Updated 2 months ago
- ☆30 Updated 4 months ago
- ☆43 Updated 2 weeks ago
- Accepted LLM Papers in NeurIPS 2024 ☆33 Updated 3 months ago
- Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking". ☆44 Updated 6 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆91 Updated 3 weeks ago
- SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights ☆45 Updated 3 months ago
- ☆98 Updated last month
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines" ☆11 Updated 3 months ago
- Pretrain, decay, SFT a CodeLLM from scratch 🧙‍♂️ ☆36 Updated 8 months ago
- ☆20 Updated 6 months ago
- ☆92 Updated 9 months ago
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters ☆37 Updated 5 months ago
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models) ☆167 Updated last year
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr… ☆85 Updated 11 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆30 Updated 2 months ago
- A MoE impl for PyTorch, [ATC'23] SmartMoE ☆61 Updated last year