Al-th / grpo_experiment
Experiment on reimplementation of GRPO RL
☆13Updated 6 months ago
Alternatives and similar repositories for grpo_experiment
Users who are interested in grpo_experiment are comparing it to the libraries listed below.
- Framework for specifying and proving properties—such as robustness, fairness, and interpretability—of machine learning models using Lean …☆64Updated 2 weeks ago
- Copies of prolog solvers for use from python☆18Updated last year
- A star for organising blocks and playing with transformers.☆23Updated last year
- Rewriting Principia Mathematica in Lean☆132Updated 8 months ago
- ☆65Updated 2 months ago
- Advanced Python Function Debugging with MCP Integration.☆57Updated last month
- Sequential Logic☆111Updated this week
- A Full Transcript of the Lighthill Debate on AI from 1973, with Introductory Remarks☆31Updated last year
- a categorical deep learning compiler☆203Updated 5 months ago
- A probabilistic approximate DNF counter☆37Updated 3 weeks ago
- Lamport's Bakery Algorithm Demonstrated in Python☆96Updated last year
- Analyzing Hacker News in real time with Bytewax and Proton☆39Updated last year
- An MCP server for symbolic manipulation of mathematical expressions☆35Updated last month
- A tiny autograd engine with a Jax-like API☆74Updated last month
- An easily-trained baby GPT that can stand in for the real thing. Based on Andrej Karpathy's makemore, but set up to mimic a llama-cpp ser…☆28Updated last year
- Full Automation of Goal-driven LLM Dialog Threads with And-Or Recursors and Refiner Oracles☆45Updated 3 weeks ago
- Tensor library & inference framework for machine learning☆108Updated this week
- 🪝"mnist" in 60 lines of code, no dependencies. For educational purposes.☆31Updated last year
- This is a NumPy implementation of the Skip-gram algorithm described in Mikolov et al.'s Word2Vec paper. It is intended for didactic purpos…☆36Updated 2 years ago
- A library for building software agents using behavior trees and language models.☆83Updated 6 months ago
- A Low Barrier Proof Assistant☆120Updated last week
- A GPU Accelerated Binary Vector Store☆47Updated 5 months ago
- LLM plugin for pulling content from Hacker News☆116Updated 3 months ago
- A playground to make it easy to try crazy things☆33Updated 2 months ago
- Hierarchical topic segmentation of meeting transcripts using embeddings and divisive clustering.☆53Updated last year
- Extremely memory-efficient vector database☆71Updated 10 months ago
- ☆27Updated 11 months ago
- ☆19Updated last week
- PostgreSQL Prolog language handler☆134Updated last year
- Learn multi-variable optimization by creating a drawing assistant. No deep learning required!☆28Updated 2 years ago