awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆96 Jun 4, 2025 Updated 8 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the repositories listed below.
- Official Repository for Task-Circuit Quantization ☆24 Jun 1, 2025 Updated 8 months ago
- ☆16 Feb 22, 2025 Updated 11 months ago
- Yet another coding assistant powered by LLM. ☆16 Sep 11, 2024 Updated last year
- A tool for an analysis of LLM generations. ☆42 Oct 13, 2025 Updated 4 months ago
- LLM4HWDesign Starting Toolkit ☆19 Oct 4, 2024 Updated last year
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆27 Mar 1, 2025 Updated 11 months ago
- Project code for training LLMs to write better unit tests + code ☆21 May 19, 2025 Updated 8 months ago
- Simulator of a basic order book flow and order execution ☆18 Mar 22, 2023 Updated 2 years ago
- Spitzers Architecture School Urban Lab for Unit 26. This repository explores designing and codifying urban systems from the bottom up in … ☆14 Mar 29, 2022 Updated 3 years ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆26 Oct 14, 2025 Updated 4 months ago
- ☆24 Apr 3, 2025 Updated 10 months ago
- Defeating the Training-Inference Mismatch via FP16 ☆182 Nov 14, 2025 Updated 3 months ago
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback" ☆40 Jul 21, 2025 Updated 6 months ago
- ☆17 Aug 1, 2025 Updated 6 months ago
- ☆22 Nov 8, 2021 Updated 4 years ago
- High frequency trading algorithm for Bitmex ☆22 Jun 22, 2020 Updated 5 years ago
- An algorithmic trading robot written in Python. ☆28 Jun 4, 2017 Updated 8 years ago
- Shopify Backend Developer Intern Challenge - Summer 2022 ☆11 Jan 15, 2022 Updated 4 years ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning ☆283 Sep 25, 2025 Updated 4 months ago
- Unveiling the Economics of SQL Operations ☆10 Apr 21, 2024 Updated last year
- Documentation at: ☆10 Dec 3, 2025 Updated 2 months ago
- ☆35 May 16, 2025 Updated 8 months ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 Apr 20, 2024 Updated last year
- ☆86 Updated this week
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆128 Oct 9, 2025 Updated 4 months ago
- CrewAI-Agentic-Jira: Enhance your Jira workflows with intelligent agent-driven automation. Powered by the CrewAI framework, this project … ☆21 Feb 3, 2025 Updated last year
- This repository contains source code and a high-quality test dataset for "Automated Commit Message Generation with Large Language Models.… ☆10 Nov 6, 2025 Updated 3 months ago
- Kinematic and dynamic models of continuum and articulated soft robots. ☆15 Nov 22, 2025 Updated 2 months ago
- This module includes functions that can be used to simulate mechanochemical phenomena. ☆11 Nov 16, 2021 Updated 4 years ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆151 Sep 19, 2025 Updated 4 months ago
- Martingale posterior neural networks for fast sequential decision making @ Neurips 2025 ☆22 Nov 13, 2025 Updated 3 months ago
- Code implementation for CoTexT: Multi-task Learning with Code-Text Transformer ☆36 Sep 14, 2021 Updated 4 years ago
- The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024. ☆13 Jun 17, 2024 Updated last year
- Interactive coding assistant for data scientists and machine learning developers, empowered by large language models. ☆99 Oct 8, 2024 Updated last year
- An artificial matrix generator in C ☆12 Feb 16, 2023 Updated 2 years ago
- ☆14 Mar 21, 2024 Updated last year
- ☆11 May 18, 2023 Updated 2 years ago
- ☆14 Apr 14, 2025 Updated 10 months ago
- Code for the paper "Faster Neural Network Training with Approximate Tensor Operations" ☆10 Oct 23, 2021 Updated 4 years ago