Al-th / grpo_experiment

Experiment on reimplementation of GRPO RL

☆13

Alternatives and similar repositories for grpo_experiment

Users who are interested in grpo_experiment are comparing it to the libraries listed below.

fraware / leanverifier
Framework for specifying and proving properties—such as robustness, fairness, and interpretability—of machine learning models using Lean …
☆63 Updated last week
philzook58 / prologsolvers
Copies of Prolog solvers for use from Python
☆18 Updated last year
rayking99 / BlockStar
A star for organising blocks and playing with transformers.
☆23 Updated last year
gvelesandro / constructor-theory-simulator
☆64 Updated last month
ndrwnaguib / principia
Rewriting Principia Mathematica in Lean
☆132 Updated 8 months ago
bytewax / hacking-hacker-news
Analyzing Hacker News in real time with Bytewax and Proton
☆39 Updated last year
colehaus / hammock-public
Visualize text embeddings
☆40 Updated 2 years ago
joelburget / microjax
A tiny autograd engine with a JAX-like API
☆71 Updated 2 weeks ago
kordless / gnosis-mystic
Advanced Python Function Debugging with MCP Integration.
☆57 Updated last month
NeilBotelho / turboAsync
A multithreaded async event loop for Python
☆58 Updated 9 months ago
rodlaf / BinaryGPUIndex
A GPU Accelerated Binary Vector Store
☆47 Updated 5 months ago
Dicklesworthstone / the_lighthill_debate_on_ai
A Full Transcript of the Lighthill Debate on AI from 1973, with Introductory Remarks
☆31 Updated last year
dmf-archive / PILF
PILF: An IPWT-inspired bionic continual learning experiment focused on mitigating catastrophic forgetting with Surprise-gated Mixture of Exper…
☆33 Updated last week
sdiehl / sympy-mcp
An MCP server for symbolic manipulation of mathematical expressions
☆34 Updated 3 weeks ago
cjdrake / seqlogic
Sequential Logic
☆111 Updated last week
D-Star-AI / minDB
Extremely memory-efficient vector database
☆71 Updated 10 months ago
spather / transformer-experiments
Some experiments on transformer models
☆11 Updated last year
tatut / pgprolog
PostgreSQL Prolog language handler
☆134 Updated last year
AugmendTech / treeseg
Hierarchical topic segmentation of meeting transcripts using embeddings and divisive clustering.
☆53 Updated 11 months ago
Dicklesworthstone / bakery_algorithm
Lamport's Bakery Algorithm Demonstrated in Python
☆96 Updated last year
BobMcDear / trap
Autoregressive transformers in APL
☆102 Updated 2 months ago
eschluntz / PytorchBridge
Designing bridge trusses with PyTorch autograd
☆61 Updated last year
Dicklesworthstone / grassmann_article
☆51 Updated last year
jhud / mygpt
An easily-trained baby GPT that can stand in for the real thing. Based on Andrej Karpathy's makemore, but set up to mimic a llama-cpp ser…
☆28 Updated last year
statusfailed / catgrad
A categorical deep learning compiler
☆203 Updated 4 months ago
zerocorebeta / Option-K
☆27 Updated 10 months ago
jmward01 / lmplay
A playground to make it easy to try crazy things
☆33 Updated last month
Tsadoq / ErisForge
Dead Simple LLM Abliteration
☆224 Updated 5 months ago
justinmeiners / why-train-when-you-can-optimize
Learn multi-variable optimization by creating a drawing assistant. No deep learning required!
☆28 Updated 2 years ago
zby / LLMEasyTools
Tools for LLM agents.
☆63 Updated 7 months ago