awjuliani / web-rl-playground

An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.

☆71

Alternatives and similar repositories for web-rl-playground

Users who are interested in web-rl-playground are comparing it to the repositories listed below.

jerpint / arxiv-txt
Fetch arxiv data to LLM-friendly text
☆123Updated 5 months ago
liyuan24 / nanoDeepResearch
A Deep Research agent from scratch
☆201Updated 2 months ago
vtabbott / Neural-Circuit-Diagrams
☆166Updated last year
ash80 / RLHF_in_notebooks
RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks
☆183Updated last month
yhlleo / LLMs_from_scratch
Learning records for building a large language model from scratch
☆54Updated 7 months ago
rodmarkun / SmolML
A fully functional and simple Machine Learning library made entirely from scratch with Python.
☆295Updated this week
MuiscNN / Lamucal
A transformer-based multimodal model for music.
☆29Updated 11 months ago
maitrix-org / easyweb
☆77Updated 3 months ago
allenai / ai2-scholarqa-lib
Repo housing the open-sourced code for the ai2 scholar qa app and the corresponding library
☆206Updated last week
sheryc / arxiv-markdown-parser-plugin
Chrome / Edge extension to turn arXiv papers into Markdown codes in one click.
☆79Updated 4 months ago
multimodal-art-projection / DailyPaper
☆54Updated 8 months ago
OSU-NLP-Group / SkillWeaver
SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis.
☆91Updated 3 months ago
willkurt / token-explorer
A simple tool that lets you explore different possible paths that an LLM might sample.
☆180Updated 3 months ago
ninehills / countdown
Countdown Game Distill&RL
☆46Updated 3 months ago
Vaibhavs10 / hf-llm.rs
☆206Updated 6 months ago
allenai / codescientist
CodeScientist: An automated scientific discovery system for code-based experiments
☆288Updated last month
Zyphra / transformers_zamba2
☆48Updated 6 months ago
eugeneyan / news-agents
📰 Building News Agents to Summarize News with MCP, Q, and tmux
☆293Updated 2 weeks ago
ArturTanona / grpo_unsloth_docker
☆57Updated 5 months ago
jina-ai / node-serp
LLM-as-SERP
☆68Updated 5 months ago
patrickloeber / llm-data-scrapers
A list of useful Open Source tools and scrapers to gather data for LLMs
☆237Updated 5 months ago
commit-0 / commit0
Commit0: Library Generation from Scratch
☆161Updated 3 months ago
StigLidu / DualDistill
The official implementation for the paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆86Updated 2 weeks ago
sigridjineth / muvera-py
Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings)
☆278Updated last month
facebookresearch / ZeroSumEval
A framework for pitting LLMs against each other in an evolving library of games ⚔
☆34Updated 3 months ago
evangelosmeklis / deepdrone
An AI agent to control drones from your CLI
☆122Updated last week
facebookresearch / llm-speedrunner
The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…
☆94Updated last week
kyryl-opens-ml / no-ocr
https://no-ocr.com/about
☆163Updated last month
hellangleZ / BY_RAG_V2
Supports BM25 + vector
☆29Updated 2 months ago
appl-team / reppl
Using APPL to reimplement popular algorithms for Large Language Models (LLMs) and prompts
☆45Updated 6 months ago