Train GRPO with zero datasets and low resources. 8-bit, 4-bit, LoRA, and QLoRA supported; multi-GPU supported ...
☆79 Apr 30, 2025 Updated 10 months ago
Alternatives and similar repositories for grpo-flat
Users that are interested in grpo-flat are comparing it to the libraries listed below
- Descriptions for simulation ☆24 Sep 28, 2025 Updated 5 months ago
- [NeurIPS 2023] LMC: Large Model Collaboration with Cross-assessment for Training-Free Open-Set Object Recognition ☆19 May 26, 2024 Updated last year
- ☆12 Nov 18, 2023 Updated 2 years ago
- ☆20 Nov 3, 2024 Updated last year
- Functional Optimal Transport: Map Estimation and Domain Adaptation for Functional data ☆27 Jun 7, 2021 Updated 4 years ago
- ☆13 Jul 2, 2025 Updated 8 months ago
- This is the source code of FUSION, a safety-aware causal representation for generalizable driving agents. ☆26 Oct 23, 2024 Updated last year
- ☆11 Feb 1, 2023 Updated 3 years ago
- Adaptive Hardness Negative Sampling for Collaborative Filtering, AAAI2024 ☆12 Dec 13, 2023 Updated 2 years ago
- This is the official project for our paper: Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers ☆31 Jan 13, 2024 Updated 2 years ago
- ☆11 Mar 6, 2022 Updated 4 years ago
- ☆16 Jun 10, 2025 Updated 9 months ago
- ☆10 Jan 7, 2022 Updated 4 years ago
- ☆14 Jul 4, 2022 Updated 3 years ago
- [CHIL 2024] Interpretation of Intracardiac Electrograms Through Textual Representations ☆12 Sep 4, 2024 Updated last year
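- A walkthrough of the official transformers source code. In the era of large AI models, PyTorch and transformers are the new operating system, and everything else is software running on top of them. ☆16 Sep 25, 2023 Updated 2 years ago
- NLP experiments: new-word discovery plus continued pre-training of pretrained models ☆47 Sep 15, 2023 Updated 2 years ago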
- ☆48 Feb 10, 2025 Updated last year
- Ground-Aware Point Cloud Semantic Segmentation for Autonomous Driving. ACM Multimedia 2019. ☆12 Sep 19, 2019 Updated 6 years ago
- Official repository for FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning (ICCV2023) ☆20 Dec 1, 2023 Updated 2 years ago
- Presentations & Notes ☆11 May 14, 2022 Updated 3 years ago
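- D3PE (Deep Data-Driven Policy Evaluation) aims to evaluate a large set of candidate policies from a fixed dataset to select the best ones. ☆11 Jun 2, 2022 Updated 3 years ago
- This project uses layoutlmv3 for training and inference on Chinese images. It mainly solves three problems: 1. normalizing data into a usable training-dataset format; 2. modifying tokenization for layoutlmv3-base-chinese; 3. splitting text longer than 512 in length and applying a sliding window ☆63 Sep 6, 2024 Updated last year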
- Classify image and text with ResNet and BERT models using Pytorch ☆13 Jul 7, 2020 Updated 5 years ago
- [EMNLP 2021] Code for our EMNLP 2021 paper “Heterogeneous Graph Neural Networks for Keyphrase Generation” ☆14 Nov 13, 2021 Updated 4 years ago
- reimplementing Neural Summarization by Extracting Sentences and Words ☆16 Dec 12, 2018 Updated 7 years ago
- Dataset2024 ☆12 Jun 12, 2025 Updated 9 months ago
- ⚠️ ARCHIVED - All development moved to https://github.com/itbench-hub/ITBench/tree/main/scenarios ☆15 Feb 24, 2026 Updated 3 weeks ago
- LONGAGENT: Scaling Language Models to 128k Context through Multi-Agent Collaboration ☆11 Mar 11, 2024 Updated 2 years ago
- A pointer-generator network based on the BART language model, for Chinese grammatical error correction ☆16 Sep 8, 2022 Updated 3 years ago
- SuperGLUE benchmark code based on bert4keras ☆14 Jun 25, 2022 Updated 3 years ago
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models ☆16 Sep 13, 2021 Updated 4 years ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision ☆19 Apr 1, 2025 Updated 11 months ago
- Fine-Tune LLM Synthetic-Data application and "From Data to AGI: Unlocking the Secrets of Large Language Model" ☆16 Jul 5, 2024 Updated last year
- ☆12 Feb 6, 2021 Updated 5 years ago
- Code for the paper "Reinforced Abstractive Summarization with Adaptive Length Controlling". ☆11 May 13, 2022 Updated 3 years ago
- Code tasks for NJU TSA (Time Series Analysis) course ☆13 Jan 24, 2024 Updated 2 years ago
- [EMNLP-2025] R1-Zero on ANY TASK ☆30 Nov 9, 2025 Updated 4 months ago
- [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search ☆109 Jun 3, 2025 Updated 9 months ago