awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆79Updated 4 months ago
Alternatives and similar repositories for web-rl-playground
Users that are interested in web-rl-playground are comparing it to the libraries listed below
- Fetch arxiv data to LLM-friendly text☆125Updated 8 months ago
- A Deep Research agent from scratch☆212Updated 5 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks☆209Updated 4 months ago
- AlphaXIV open-source alternative: Chat with any arXiv paper.☆87Updated 5 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python.☆306Updated last week
- LLM-as-SERP☆71Updated 7 months ago
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux☆307Updated 3 months ago
- Curated resources for discovering, reading, and working with arXiv papers☆347Updated 4 months ago
- Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings)☆330Updated 3 months ago
- Countdown Game Distill&RL☆47Updated last month
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon.☆46Updated this week
- Conversation logs with Claude 3.5 Sonnet to try and iteratively optimize code☆99Updated 9 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience"☆62Updated 6 months ago
- [EMNLP 2025] The official implementation of the paper "Agentic-R1: Distilled Dual-Strategy Reasoning"☆101Updated 2 months ago
- Auto Thinking Mode switch for Qwen3 in Open webui☆68Updated 5 months ago
- Using APPL to reimplement popular algorithms for Large Language Models (LLMs) and prompts☆45Updated 9 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis.☆98Updated 6 months ago
- Repo housing the open-source code for the Ai2 Scholar QA app and its corresponding library☆232Updated 2 weeks ago
- Open-source autonomous cleaning & housekeeping robot☆237Updated 3 months ago
- An AI agent to control drones from your CLI☆133Updated 2 months ago
- A simple tool that lets you explore different possible paths that an LLM might sample.☆190Updated 5 months ago