rsshyam/GRPO-bandits

☆13

Alternatives and similar repositories for GRPO-bandits

Users that are interested in GRPO-bandits are comparing it to the libraries listed below.

MatthewKKai / MaLP
Implementation Code for "LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination"
☆14 · May 17, 2026 · Updated 2 months ago
mrmaheshrajput / productionizing-llms
Code Repository for Blog - How to Productionize Large Language Models (LLMs)
☆12 · Mar 27, 2024 · Updated 2 years ago
rsshyam / GRPO
☆71 · Jul 28, 2024 · Updated 2 years ago
facebookresearch / PerSE
Personalized Story Evaluation Model
☆17 · Nov 27, 2023 · Updated 2 years ago
mlcommons / dataperf
Data Benchmarking
☆25 · May 24, 2024 · Updated 2 years ago
WEIRDLabUW / vpl_llm
Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"
☆27 · Aug 21, 2024 · Updated last year
nyankichi820 / appcrawler
App Store and Google Play ranking and review crawler
☆11 · Jan 22, 2014 · Updated 12 years ago
CVxTz / llm-serve-tutorial
☆20 · Apr 7, 2024 · Updated 2 years ago
zalanwastaken / guified
Guified is a GUI library for LÖVE (Love2D) that simplifies window management and UI element creation. It allows developers to create inte…
☆16 · Apr 15, 2026 · Updated 3 months ago
CarperAI / nmmo-environment
Neural MMO - A Massively Multiagent Environment for Artificial Intelligence Research
☆15 · May 30, 2024 · Updated 2 years ago
MonteFloyd / ECSCity
High population city simulation in Unity ECS
☆12 · Jul 20, 2018 · Updated 8 years ago
chenllliang / MMEvalPro
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆25 · Sep 26, 2024 · Updated last year
psunlpgroup / FoVer
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆19 · Apr 7, 2026 · Updated 3 months ago
psuong / arcade-vehicle-controller
An open source implementation of a vehicle controller using Unity's ECS.
☆16 · Mar 19, 2019 · Updated 7 years ago
hyunseoklee-ai / ReMoDetect
ReMoDetect: Reward Models Recognize Aligned LLM's Generations (NeurIPS 2024)
☆17 · Nov 15, 2024 · Updated last year
jjiscool / zero
A pixel-style roguelike RPG dungeon game in Unity3D. Weapon forging will be the primary element.
☆14 · Dec 19, 2016 · Updated 9 years ago
HdacTech / doc
Hdac Document
☆23 · Jul 25, 2018 · Updated 8 years ago
ashishpatel26 / DataStructure-for-Data-Science
Data structures for data science
☆23 · Apr 12, 2024 · Updated 2 years ago
sculd / algorithmic_intraday_trading
☆11 · Oct 6, 2020 · Updated 5 years ago
senyka0 / binance-options-arbitrage
Script for trading arbitrage opportunities between European-style options and perpetual futures, with notifications in Telegram
☆11 · Jun 10, 2023 · Updated 3 years ago
davidvonthenen / open-virtual-assistant
Open Source Virtual Assistant Framework
☆13 · Sep 4, 2025 · Updated 10 months ago
Zixir-lang / Zixir
Zixir: a small, expression-oriented language and three-tier runtime (Elixir + Zig + Python) for agentic coding
☆18 · Mar 7, 2026 · Updated 4 months ago
morgynp / Resource-Based-Weapon-System
Resource Based Weapon System template for Godot 4.3. Tutorial included!
☆12 · Jul 26, 2025 · Updated last year
dkw-aau / qse
Quality Shapes Extraction from very large Knowledge Graphs
☆14 · Nov 15, 2025 · Updated 8 months ago
we-z / Orderbook-HFT
Order Book Imbalance trading strategy
☆11 · Nov 21, 2022 · Updated 3 years ago
keinsell / Hftish
Alpaca-based Order Book Imbalance Algorithm.
☆12 · Jul 23, 2020 · Updated 6 years ago
eigenpi / VHDL-Examples-from-Pong-Chu-Book
This repository contains all the needed source files for several examples from Pong Chu's book: "Pong P. Chu, FPGA Prototyping by VHDL Ex…
☆11 · Apr 2, 2022 · Updated 4 years ago
The-Pocket / PocketFlow-Zig
Pocket Flow: A minimalist LLM framework. Let Agents build Agents!
☆16 · Jan 26, 2026 · Updated 6 months ago
GemsLab / GLIMPSE-personalized-KGsummarization
Personalized knowledge graph summarization based on historical queries
☆14 · Jun 17, 2020 · Updated 6 years ago
PGraphRAG-benchmark / PGraphRAG
Personalized Graph-based Retrieval for LLMs Benchmark
☆34 · Feb 16, 2025 · Updated last year
mizchi / kagura
2D-first game engine for MoonBit inspired by Ebiten
☆15 · Jul 21, 2026 · Updated last week
ygtxr1997 / CodingEveryday
Still need to practice more
☆12 · Jul 31, 2020 · Updated 5 years ago
Stanford-TML / HEAD_rl_deploy
Official implementation of HEAD CoRL 2025
☆18 · Aug 9, 2025 · Updated 11 months ago
SparkJiao / MG-PFCM_outfit_rec
Personalized Fashion Compatibility Modeling via Metapath-guided Heterogeneous Graph Learning.
☆16 · Nov 7, 2022 · Updated 3 years ago
DaLi-Jack / 3D-Tools
☆10 · Oct 30, 2023 · Updated 2 years ago
VincentLongpre / ppo-trader
Developing, training, and assessing the performance of a Proximal Policy Optimization (PPO) Stock Trading Agent.
☆14 · Aug 20, 2025 · Updated 11 months ago
olivierjeunen / pessimism-recsys-2021
Source code for our paper "Pessimistic Decision-Making for Recommender Systems" published at ACM TORS, and RecSys 2021.
☆11 · Dec 15, 2022 · Updated 3 years ago
knowledgedefinednetworking / -knowledge-defined-networking
☆12 · Oct 17, 2022 · Updated 3 years ago
likenneth / q_probe
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
☆40 · Jun 10, 2024 · Updated 2 years ago