Train GRPO with zero dataset and low resources; 8-bit/4-bit/LoRA/QLoRA and multi-GPU supported ...
☆79Apr 30, 2025Updated 11 months ago
Alternatives and similar repositories for grpo-flat
Users that are interested in grpo-flat are comparing it to the libraries listed below.
- From Llama to Deepseek, grpo/mtp implemented. With pt/sft/lora/qlora included☆30Apr 21, 2025Updated 11 months ago
- NutritionMaster_ShiShenPro - "Pro Nutrition, Pro Life, Master Your Diet with ShiShenPro" (Nutrition Master: ShiShen Pro; professional nutrition, professional life, manage your diet and recipes with ShiShen)☆10Aug 20, 2024Updated last year
- Descriptions for simulation☆26Mar 30, 2026Updated last week
- Official code repository for CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models☆27Sep 26, 2025Updated 6 months ago
- ☆11Nov 18, 2023Updated 2 years ago
- A user-friendly interface built on top of Thinking Machines Tinker API that lets you fine-tune LLMs, chat with your trained model, and de…☆30Jan 31, 2026Updated 2 months ago
- ☆20Nov 3, 2024Updated last year
- SIGIR 2022 CODE☆10Apr 1, 2022Updated 4 years ago
- Functional Optimal Transport: Map Estimation and Domain Adaptation for Functional data☆27Jun 7, 2021Updated 4 years ago
- Code to fine-tune DETR on a custom dataset☆14Oct 29, 2021Updated 4 years ago
- ☆18Feb 8, 2026Updated 2 months ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024☆11Jun 6, 2024Updated last year
- Official project for our paper: Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers☆31Jan 13, 2024Updated 2 years ago
- ☆14Jul 4, 2022Updated 3 years ago
- Data collection from Moltbook for research☆48Mar 30, 2026Updated last week
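- A walkthrough of the official transformers source code. In the era of large AI models, PyTorch and transformers are the new operating system, and everything else is software running on top of them.☆16Sep 25, 2023Updated 2 years ago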
- ☆12Jun 14, 2019Updated 6 years ago
- NLP experiments: new-word discovery plus continued pre-training of pretrained models☆47Sep 15, 2023Updated 2 years ago
- ☆48Feb 10, 2025Updated last year
- D3PE (Deep Data-Driven Policy Evaluation) aims to evaluate a large set of candidate policies from a fixed dataset to select the best ones.☆11Jun 2, 2022Updated 3 years ago
- ☆11Nov 1, 2022Updated 3 years ago
- ☆23May 4, 2020Updated 5 years ago
- LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration☆11Mar 11, 2024Updated 2 years ago
- ☆11Aug 8, 2022Updated 3 years ago
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models☆16Sep 13, 2021Updated 4 years ago
- SuperGLUE benchmark code based on bert4keras☆14Jun 25, 2022Updated 3 years ago
- Fine-Tune LLM Synthetic-Data application and "From Data to AGI: Unlocking the Secrets of Large Language Model"☆16Jul 5, 2024Updated last year
- ☆12Feb 6, 2021Updated 5 years ago
- Code for the paper "Reinforced Abstractive Summarization with Adaptive Length Controlling".☆11May 13, 2022Updated 3 years ago
- Implementation for "ROLL: Visual Self-Supervised Reinforcement Learning with Object Reasoning", CoRL 2020☆16Jun 22, 2022Updated 3 years ago
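- A highly efficient string-matching tool that supports forward/backward maximum-matching word segmentation and exact multi-pattern string matching☆16Jul 29, 2023Updated 2 years ago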
- [EMNLP-2025] R1-Zero on ANY TASK☆30Nov 9, 2025Updated 5 months ago
- Simple MLP for representing the SDF of a single shape☆17Jun 30, 2023Updated 2 years ago
- GEO search engine optimization analysis tool☆31Mar 4, 2026Updated last month
- We design a spectral compression mapping (SCM) for full-band speech enhancement, and propose a two-stage stream named MHA-DPCRN☆24Jul 4, 2022Updated 3 years ago
- ☆14Mar 11, 2022Updated 4 years ago
- Some tips on paper writing skills.☆14May 25, 2022Updated 3 years ago
- Code for paper "Incorporating Multimodal Information in Open-Domain Web Keyphrase Extraction"☆19Jan 28, 2021Updated 5 years ago
- Multiple Character Embeddings for Chinese Word Segmentation, ACL 2019☆17Sep 7, 2019Updated 6 years ago