erxiong0 / chichi-gitbook
Build Jekyll site with GitBook style!
☆14Updated 2 months ago
Alternatives and similar repositories for chichi-gitbook
Users that are interested in chichi-gitbook are comparing it to the libraries listed below
- To reproduce the experiments in Sutton's book☆14Updated 4 months ago
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL☆3,289Updated this week
- Reproduce R1 Zero on Logic Puzzle☆2,384Updated 4 months ago
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.☆2,203Updated this week
- minimal-cost for training 0.5B R1-Zero☆766Updated 2 months ago
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models☆1,810Updated 6 months ago
- adds Sequence Parallelism into LLaMA-Factory☆11Updated 7 months ago
- Distributed RL System for LLM Reasoning☆2,215Updated this week
- A very simple GRPO implementation for reproducing R1-like LLM thinking.☆1,255Updated this week
- O1 Replication Journey☆1,999Updated 6 months ago
- ☆879Updated last month
- Large Reasoning Models☆804Updated 8 months ago
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (…☆9,182Updated this week
- Reproductions of LLM-related algorithms, plus some study notes☆1,983Updated 2 weeks ago
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models☆1,651Updated this week
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL☆2,973Updated 3 weeks ago
- An Open-source RL System from ByteDance Seed and Tsinghua AIR☆1,494Updated 3 months ago
- An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Asy…☆7,600Updated this week
- Scalable RL solution for advanced reasoning of language models☆1,678Updated 4 months ago
- ☆94Updated 10 months ago
- Official Repo for Open-Reasoner-Zero☆2,020Updated 2 months ago
- Search-o1: Agentic Search-Enhanced Large Reasoning Models☆1,004Updated 2 months ago
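- Welcome to LLM-Dojo, an open-source place for learning about large models. It uses concise, easy-to-read code to build a model training framework (supporting mainstream models such as Qwen, Llama, and GLM), an RLHF framework (DPO/CPO/KTO/PPO), and other features. 👩🎓👨🎓☆827Updated last month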
- ☆955Updated 6 months ago
- Recipes to train reward model for RLHF.☆1,425Updated 3 months ago
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs.☆6,832Updated this week
- Parsing-free RAG supported by VLMs☆763Updated 5 months ago
- Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, ReRanker.☆995Updated last month
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning☆613Updated this week
- ☆545Updated 7 months ago