826568389 / GRPO-R1
☆12 Updated last month
Alternatives and similar repositories for GRPO-R1:
Users who are interested in GRPO-R1 are comparing it to the repositories listed below
- GoGPT Chinese instruction dataset construction ☆10 Updated last year
- Recursive Abstractive Processing for Tree-Organized Retrieval ☆11 Updated 10 months ago
- General-purpose simple tools project ☆16 Updated 6 months ago
- KDD 2024 AQA competition 2nd place solution ☆11 Updated 9 months ago
- The newest version of Llama 3, with source code explained line by line in Chinese ☆22 Updated last year
- LLM+RAG for QA ☆21 Updated last year
- ☆21 Updated 9 months ago
- LLM RAG application with support for API calls and voice interaction. ☆11 Updated 9 months ago
- meta-comprehensive-rag-benchmark-kdd-cup-2024 phase1 task1 rank3 ☆17 Updated 10 months ago
- ☆19 Updated 9 months ago
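- First-place (top 1) solution for the Tianchi algorithm competition "BetterMixture - LLM Data Mixture Challenge" ☆28 Updated 9 months ago
- Instruction fine-tuning of the BLOOM model ☆24 Updated last year
- An introduction to using docker and docker compose. ☆20 Updated 7 months ago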
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and … ☆18 Updated last year
- ☆33 Updated last week
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction". ☆64 Updated 8 months ago
- A survey of large language model training and serving ☆37 Updated last year
- ☆23 Updated last year
- KDD2024-WhoIsWho-Top3 ☆16 Updated 10 months ago
- [ACL 2024 Findings] Learning Fine-Grained Grounded Citations for Attributed Large Language Models ☆18 Updated 5 months ago
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b ☆14 Updated last year
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆55 Updated last year
- Knowledge-Reasoning Synergy Reinforcement Learning. ☆34 Updated last month
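- Large language model applications: RAG, NL2SQL, chatbots, pre-training, MoE (mixture-of-experts) models, fine-tuning, reinforcement learning, Tianchi data competitions ☆60 Updated 2 months ago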
- ☆16 Updated 10 months ago
- ☆12 Updated 7 months ago
- A single codebase for instruction fine-tuning large models ☆38 Updated last year
- ☆15 Updated last week
- A practical guide to large language models: applied practice and real-world scenarios ☆68 Updated 7 months ago
- Kaggle 2024 Eedi 10th-place gold medal solution ☆34 Updated 3 months ago