XU-YIJIE / grpo-flat
Train your GRPO with zero dataset and low resources; 8-bit/4-bit/LoRA/QLoRA supported, multi-GPU supported ...
☆79 · Apr 30, 2025 · Updated 9 months ago
Alternatives and similar repositories for grpo-flat
Users who are interested in grpo-flat are comparing it to the libraries listed below.
- Descriptions for simulation ☆24 · Sep 28, 2025 · Updated 4 months ago
- ☆20 · Nov 3, 2024 · Updated last year
- ☆48 · Feb 10, 2025 · Updated last year
- This is the source code of FUSION, a safety-aware causal representation for generalizable driving agents. ☆26 · Oct 23, 2024 · Updated last year
- Functional Optimal Transport: Map Estimation and Domain Adaptation for Functional data ☆27 · Jun 7, 2021 · Updated 4 years ago
- [CHIL 2024] Interpretation of Intracardiac Electrograms Through Textual Representations ☆12 · Sep 4, 2024 · Updated last year
- Financial Analysis and Algorithmic Trading Strategies in Python ☆11 · Feb 16, 2023 · Updated 3 years ago
- Research on adversarial attacks and defenses for deep neural network 3D point cloud classifiers like PointNet and PointNet++. ☆27 · May 22, 2020 · Updated 5 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult… ☆11 · Aug 29, 2023 · Updated 2 years ago
- An open-ended, self-improving AI system that evolves its own source code using a local LLM. Built for autonomy, reflection, and code evol… ☆21 · Jan 24, 2026 · Updated 3 weeks ago
- RL algorithm for stock trading with multiple reward functions ☆11 · Apr 21, 2024 · Updated last year
- Turtlebot maze solver that senses its environment through laser scans and navigates. A mini project from the Robot Ignite Academy course ☆10 · Aug 15, 2020 · Updated 5 years ago
- Complete collection of Zhejiang University emblems; copy the image links for easy use in Markdown ☆32 · Nov 17, 2024 · Updated last year
- ☆14 · Jul 4, 2022 · Updated 3 years ago
- FinanceGPT-B ☆10 · Mar 26, 2024 · Updated last year
- ☆10 · Jul 21, 2019 · Updated 6 years ago
- Q-HEART: ECG Question Answering via Knowledge-Informed Multimodal LLMs (ECAI 2025) ☆14 · Jan 23, 2026 · Updated 3 weeks ago
- Code release for "Imagination Mechanism: Mesh Information Propagation for Enhancing Data Efficiency in Reinforcement Learning" ☆13 · Oct 7, 2023 · Updated 2 years ago
- This project focuses on stock prediction; our goal is to implement a trading framework using DRL with LSTM. ☆11 · Jun 1, 2018 · Updated 7 years ago
- Matrix Product State algorithm for computing characters of the symmetric group S_n ☆11 · Sep 26, 2025 · Updated 4 months ago
- ☆13 · Apr 3, 2025 · Updated 10 months ago
- Open Source Tsetlin Machine framework ☆17 · Oct 15, 2018 · Updated 7 years ago
- Background Subtraction for complex scenes such as intersections from surveillance cameras ☆10 · Jul 15, 2022 · Updated 3 years ago
- ☆11 · Mar 6, 2022 · Updated 3 years ago
- ☆13 · May 25, 2023 · Updated 2 years ago
- Simple MoE - Day 17 of 365 Days of Repos ☆16 · Jan 17, 2025 · Updated last year
- ☆12 · Nov 21, 2023 · Updated 2 years ago
- D3PE (Deep Data-Driven Policy Evaluation) aims to evaluate a large set of candidate policies from a fixed dataset to select the best ones. ☆11 · Jun 2, 2022 · Updated 3 years ago
- An LLM Agent allowing users to ask natural language questions and get answers about SEC filings of companies of their choice as well as fin… ☆12 · Feb 27, 2024 · Updated last year
- Gathers machine learning and deep learning models for Reinforcement Learning ☆10 · Sep 8, 2018 · Updated 7 years ago
- ☆13 · May 15, 2025 · Updated 9 months ago
- ☆13 · Jul 20, 2024 · Updated last year
- Professional Wargaming LLM Toolbox ☆20 · Jul 9, 2025 · Updated 7 months ago
- ☆11 · Oct 6, 2020 · Updated 5 years ago
- High-quality reference implementations of various algorithms for Inverse Reinforcement Learning ☆13 · Jun 20, 2018 · Updated 7 years ago
- Apply different deep learning models to limit order book. ☆11 · Mar 6, 2018 · Updated 7 years ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024 ☆11 · Jun 6, 2024 · Updated last year
- Reinforcement learning crypto trading bot ☆10 · Oct 30, 2020 · Updated 5 years ago
- In the high-frequency era of trading, stock orders can be executed in under a millisecond. The information about the thousands of orders … ☆10 · Mar 30, 2016 · Updated 9 years ago