ljc010717 / GRPO2025
☆23Updated 7 months ago
Alternatives and similar repositories for GRPO2025
Users who are interested in GRPO2025 are comparing it to the repositories listed below.
- A live reading list for LLM data synthesis (Updated to July, 2025).☆408Updated 2 months ago
- RAG paper study notes☆176Updated 8 months ago
- ☆170Updated last year
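- Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama 3.☆210Updated last year
- This repository mainly collects reading notes on top-conference papers relevant to LLM algorithm engineers (multimodal, PEFT, few-shot QA, RAG, LMM interpretability, Agents, CoT)☆367Updated last year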
- llm & rl☆246Updated 3 weeks ago
- Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning☆881Updated 4 months ago
- Kaggle 2024 Eedi 10th-place gold medal solution☆43Updated 10 months ago
- A Survey on Multimodal Retrieval-Augmented Generation☆421Updated 2 weeks ago
- WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge☆127Updated last year
- Awesome List for Agentic RL☆542Updated 2 weeks ago
- Custom reward development on top of verl☆128Updated 6 months ago
- Reinforcement Learning in LLM and NLP.☆61Updated 2 months ago
- A One-Stop Reward Model Platform☆90Updated this week
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning☆653Updated 3 months ago
- ☆382Updated last month
- ☆549Updated 10 months ago
- personal chatgpt☆390Updated 11 months ago
- ☆268Updated 11 months ago
- ☆38Updated 3 months ago
- ☆423Updated last month
- ☆58Updated last year
- TinyRAG☆368Updated 4 months ago
- Latest Advances on Long Chain-of-Thought Reasoning☆554Updated 4 months ago
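- RAG interest group: a RAG application written entirely by hand. Most LangChain libraries are convenient, but you may not understand the principles behind them, so the code exposes the basic algorithms wherever possible, with a focus on understanding how RAG works☆241Updated last year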
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆400Updated 4 months ago
- ☆104Updated 5 months ago
- Quick start for RAG and private deployment☆211Updated last year
- ☆65Updated 6 months ago
- ☆11Updated last year