ljc010717 / GRPO2025
☆22Updated 2 months ago
Alternatives and similar repositories for GRPO2025
Users who are interested in GRPO2025 are comparing it to the libraries listed below.
- ☆133Updated 9 months ago
- Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥☆262Updated 5 months ago
- Kaggle 2024 Eedi 10th-place gold-medal solution☆35Updated 5 months ago
- llm & rl☆151Updated this week
- Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama 3.☆199Updated 8 months ago
- ☆141Updated last year
- ☆63Updated last month
- ☆241Updated 2 weeks ago
- ☆31Updated 10 months ago
- ☆41Updated 10 months ago
- ☆83Updated 4 months ago
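- Large language model applications: RAG, NL2SQL, chatbots, pre-training, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, Tianchi data competitions☆62Updated 4 months ago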
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆228Updated 3 weeks ago
- ☆85Updated 2 weeks ago
- Awesome Agent Training☆164Updated this week
- RAG paper study☆142Updated 3 months ago
- Custom reward development on verl☆54Updated last month
- Advanced LLM interview notes☆52Updated last month
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆373Updated 9 months ago
- Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆155Updated last week
- WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge☆120Updated 7 months ago
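- Inspired by self-instruct: beyond general-purpose LLMs, small vertical-domain LLMs can be built for customized results, using questions and answers obtained from GPT as training data☆15Updated 2 years ago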
- This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the…☆115Updated last month
- Train an LLM from scratch on a single 24 GB GPU☆55Updated last month
- ☆82Updated last year
- Quick start for RAG and private deployment☆191Updated last year
- ☆222Updated last week
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆122Updated 7 months ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆27Updated last month
- An Awesome List of Reinforcement Learning-based Large Language Agent Works. Collect directly from official code base.☆154Updated this week