Train GRPO with zero dataset and low resources; 8-bit/4-bit/LoRA/QLoRA supported, multi-GPU supported ...
☆80 · Apr 30, 2025 · Updated last year
Alternatives and similar repositories for grpo-flat
Users that are interested in grpo-flat are comparing it to the libraries listed below.
- From Llama to Deepseek, grpo/mtp implemented. With pt/sft/lora/qlora included ☆30 · Apr 21, 2025 · Updated last year
- "SCONE: A Novel Stochastic Sampling to Generate Contrastive Views and Hard Negative Samples for Recommendation", WSDM 2025 ☆17 · Nov 25, 2025 · Updated 5 months ago
- ☆12 · Apr 13, 2024 · Updated 2 years ago
- Official code repository for CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models ☆27 · Sep 26, 2025 · Updated 7 months ago
- ☆11 · Nov 18, 2023 · Updated 2 years ago
- ☆20 · Nov 3, 2024 · Updated last year
- SIGIR 2022 CODE ☆10 · Apr 1, 2022 · Updated 4 years ago
- Functional Optimal Transport: Map Estimation and Domain Adaptation for Functional data ☆27 · Jun 7, 2021 · Updated 4 years ago
- Research on adversarial attacks and defenses for deep neural network 3D point cloud classifiers like PointNet and PointNet++. ☆27 · May 22, 2020 · Updated 5 years ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024 ☆11 · Jun 6, 2024 · Updated last year
- ☆13 · Jul 2, 2025 · Updated 10 months ago
- This is the source code of FUSION, a safety-aware causal representation for generalizable driving agents. ☆26 · Oct 23, 2024 · Updated last year
- Adaptive Hardness Negative Sampling for Collaborative Filtering, AAAI 2024 ☆12 · Dec 13, 2023 · Updated 2 years ago
- Local feature crack for Termius Pro ☆10 · May 11, 2024 · Updated last year
- A collection of libraries for recommender systems