Al-th / grpo_experiment
Experiment on reimplementation of GRPO RL
☆15 · Updated 9 months ago
Alternatives and similar repositories for grpo_experiment
Users who are interested in grpo_experiment are comparing it to the libraries listed below.
- Framework for specifying and proving properties (such as robustness, fairness, and interpretability) of machine learning models using Lean … ☆72 · Updated 4 months ago
- Copies of Prolog solvers for use from Python ☆19 · Updated last year
- A star for organising blocks and playing with transformers. ☆23 · Updated last year
- A probabilistic approximate DNF counter ☆37 · Updated 2 months ago
- An MCP server for symbolic manipulation of mathematical expressions ☆43 · Updated 5 months ago
- isingLenzMC: Monte Carlo for Classical Ising Model (with core C library) ☆53 · Updated 2 months ago
- A Full Transcript of the Lighthill Debate on AI from 1973, with Introductory Remarks ☆33 · Updated last year
- Rewriting Principia Mathematica in Lean ☆136 · Updated 2 months ago
- A tiny autograd engine with a JAX-like API ☆74 · Updated 4 months ago
- A categorical deep learning compiler ☆205 · Updated 2 months ago
- Fast combinations calculation in JAX ☆39 · Updated last year
- A package for defining deep learning models using categorical algebraic expressions. ☆61 · Updated last year
- ☆14 · Updated 2 years ago
- Proof of thought: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON DSL) ☆357 · Updated last month
- ☆150 · Updated last week
- ☆68 · Updated last month
- LLM verified with Monte Carlo Tree Search ☆283 · Updated 8 months ago
- LeanAgent is a novel lifelong learning framework for formal theorem proving that continuously generalizes to and improves on ever-expandi… ☆43 · Updated 5 months ago
- A playground to make it easy to try crazy things ☆33 · Updated this week
- An OpenAI wrapper for PyReason to use in a Grid World reinforcement learning setting ☆31 · Updated last year
- Tensor library & inference framework for machine learning ☆113 · Updated last month
- Model Context Protocol (MCP) server for constraint optimization and solving ☆140 · Updated 2 months ago
- Interpolate between embedding points with an LLM ☆38 · Updated last year
- A library for building software agents using behavior trees and language models. ☆89 · Updated 9 months ago
- Fun with wgpu: Simulating slime mold ☆24 · Updated last year
- Advanced Python Function Debugging with MCP Integration. ☆57 · Updated 5 months ago
- First-order logic theorem prover supporting unification with approximate vector similarity ☆13 · Updated 2 years ago
- PILF: An IPWT-inspired bionic continual learning experiment focused on mitigating catastrophic forgetting with Surprise-gated Mixture of Exper… ☆36 · Updated 4 months ago
- PyTorch script hot swap: Change code without unloading your LLM from VRAM ☆125 · Updated 7 months ago
- ☆108 · Updated last year