Al-th / grpo_experiment
Experiment on reimplementation of GRPO RL
☆17Updated last year
Alternatives and similar repositories for grpo_experiment
Users that are interested in grpo_experiment are comparing it to the libraries listed below
- A star for organising blocks and playing with transformers.☆23Updated last year
- Framework for specifying and proving properties—such as robustness, fairness, and interpretability—of machine learning models using Lean …☆79Updated 6 months ago
- Copies of prolog solvers for use from python☆19Updated last year
- fast combinations calculation in jax☆39Updated last year
- A tiny autograd engine with a Jax-like API☆74Updated 7 months ago
- A Full Transcript of the Lighthill Debate on AI from 1973, with Introductory Remarks☆33Updated last year
- Automatically extract executable programs from pruned mechanistic circuits, extending OpenAI's Sparse Circuits☆62Updated 2 months ago
- Proof of thought: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON DSL)☆364Updated 3 months ago
- ☆69Updated 3 months ago
- A package for defining deep learning models using categorical algebraic expressions.☆61Updated last year
- An MCP server for symbolic manipulation of mathematical expressions☆53Updated 7 months ago
- a categorical deep learning compiler☆207Updated 4 months ago
- Designing bridge trusses with Pytorch autograd☆61Updated 2 years ago
- ☆93Updated last week
- A playground to make it easy to try crazy things☆33Updated 2 months ago
- LLM verified with Monte Carlo Tree Search☆284Updated 10 months ago
- A library for building software agents using behavior trees and language models.☆90Updated last year
- time to learn mlx☆42Updated 4 months ago
- LeanAgent is a novel lifelong learning framework for formal theorem proving that continuously generalizes to and improves on ever-expandi…☆53Updated 7 months ago
- Automated Capability Discovery via Foundation Model Self-Exploration☆66Updated 11 months ago
- An OpenAI wrapper for PyReason to use in a Grid World reinforcement learning setting☆32Updated 2 years ago
- A programming language for formal/informal computation.☆43Updated last month
- This repository contains a simple llama3 implementation in pure jax.☆71Updated 11 months ago
- Code for the Fractured Entangled Representation Hypothesis position paper!☆221Updated 3 months ago
- A probabilistic approximate DNF counter☆39Updated 2 months ago
- Implement recursion using English as the programming language and an LLM as the runtime.☆240Updated 2 years ago
- Enjoy puzzle-solving directly in your browser.☆32Updated 9 months ago
- Rewriting Principia Mathematica in Lean☆138Updated last week
- Pytorch script hot swap: Change code without unloading your LLM from VRAM☆125Updated 9 months ago
- LLMs playing chess are sensitive to how the position came to be☆24Updated last year