Al-th / grpo_experiment
An experiment in reimplementing GRPO RL
☆17 Updated 10 months ago
Alternatives and similar repositories for grpo_experiment
Users that are interested in grpo_experiment are comparing it to the libraries listed below
- Framework for specifying and proving properties (such as robustness, fairness, and interpretability) of machine learning models using Lean … ☆73 Updated 5 months ago
- A star for organising blocks and playing with transformers. ☆23 Updated last year
- Copies of Prolog solvers for use from Python ☆19 Updated last year
- Fast combinations calculation in JAX ☆39 Updated last year
- An OpenAI wrapper for PyReason to use in a Grid World reinforcement learning setting ☆32 Updated 2 years ago
- A tiny autograd engine with a JAX-like API ☆74 Updated 5 months ago
- Designing bridge trusses with PyTorch autograd ☆61 Updated last year
- A package for defining deep learning models using categorical algebraic expressions. ☆61 Updated last year
- A programming language for formal/informal computation. ☆42 Updated 4 months ago
- Advanced Python Function Debugging with MCP Integration. ☆57 Updated 6 months ago
- isingLenzMC: Monte Carlo for Classical Ising Model (with core C library) ☆53 Updated 3 months ago
- Rewriting Principia Mathematica in Lean ☆136 Updated 3 months ago
- Proof of Thought: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON DSL) ☆365 Updated 2 months ago
- An MCP server for symbolic manipulation of mathematical expressions ☆47 Updated 6 months ago
- A categorical deep learning compiler ☆206 Updated 3 months ago
- A playground to make it easy to try crazy things ☆33 Updated last month
- ☆68 Updated 2 months ago
- Repo for solving ARC problems with Neural Cellular Automata ☆23 Updated 7 months ago
- A library for building software agents using behavior trees and language models. ☆90 Updated 10 months ago
- ☆75 Updated last year
- Geometric Algebra package for JAX ☆54 Updated 4 years ago
- LLM verified with Monte Carlo Tree Search ☆284 Updated 9 months ago
- A Full Transcript of the Lighthill Debate on AI from 1973, with Introductory Remarks ☆33 Updated last year
- Full Automation of Goal-driven LLM Dialog Threads with And-Or Recursors and Refiner Oracles ☆44 Updated 3 months ago
- ☆109 Updated last year
- LeanAgent is a novel lifelong learning framework for formal theorem proving that continuously generalizes to and improves on ever-expandi… ☆46 Updated 6 months ago
- Lernd is a ∂ILP (dILP) framework implementation based on DeepMind's paper "Learning Explanatory Rules from Noisy Data". ☆26 Updated 2 years ago
- First-Order Probabilistic Programming Language ☆29 Updated 6 years ago
- A probabilistic approximate DNF counter ☆39 Updated last month
- ☆74 Updated 3 years ago