ljc010717 / GRPO2025
☆14Updated 3 weeks ago
Alternatives and similar repositories for GRPO2025
Users who are interested in GRPO2025 are comparing it to the libraries listed below
- ☆117Updated 8 months ago
- Advanced LLM interview notes☆48Updated last week
- Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥☆256Updated 3 months ago
- ☆140Updated last year
- RAG paper study notes☆126Updated last month
- Kaggle 2024 Eedi 10th-place gold-medal solution☆34Updated 4 months ago
- A Survey on Multimodal Retrieval-Augmented Generation☆172Updated 3 weeks ago
- Awesome Agent Training☆106Updated this week
- Full-parameter, LoRA, and QLoRA fine-tuning of Llama 3.☆195Updated 7 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆119Updated 6 months ago
- WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge☆120Updated 6 months ago
- Latest Advances on Long Chain-of-Thought Reasoning☆298Updated last month
- ☆81Updated last year
- DPO training for Tongyi Qianwen (Qwen)☆47Updated 7 months ago
- llm & rl☆120Updated this week
- ☆42Updated 3 months ago
- An up-to-date curated list of Retrieval-Augmented Generation (RAG) for Large Language Models (LLMs).☆63Updated this week
- This is an implementation of the paper "Improve Mathematical Reasoning in Language Models by Automated Process Supervision" from Google de…☆30Updated last month
- ☆81Updated 3 weeks ago
- ☆59Updated last week
- ☆132Updated 3 weeks ago
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆365Updated 8 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆116Updated 7 months ago
- This repository mainly records interview questions for large language model (LLM) algorithm engineers, along with my own answers☆23Updated last year
- [ACL-2024] Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training☆26Updated 6 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆209Updated 2 weeks ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆77Updated 6 months ago
- ☆133Updated 3 months ago
- ☆55Updated 7 months ago
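- EasyOffer (LLM Interview Collection) is an LLM summer-internship offer guide tailored for LLM newcomers. It mainly covers common big-tech live-coding problems, big-tech interview experiences, and common big-tech open-ended questions for LLM summer internships and autumn recruitment. I'm a beginner and still learning; corrections from experts are welcome anytime. I hope everyone gets their ideal Of…☆201Updated last month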