awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆93Updated 7 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the repositories listed below
- Fetch arxiv data to LLM-friendly text☆128Updated last month
- A Deep Research agent from scratch☆214Updated 8 months ago
- ☆54Updated last year
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks☆230Updated 7 months ago
- ☆169Updated last year
- ☆81Updated 4 months ago
- alphaxiv open source alternative☆106Updated 7 months ago
- ☆264Updated 2 months ago
- ☆80Updated 9 months ago
- Learning records for building a large language model from scratch☆58Updated last year
- Challenges for general-purpose web-browsing AI agents☆67Updated 7 months ago
- ☆57Updated 11 months ago
- ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations [COLM 2025]☆249Updated 6 months ago
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux☆309Updated 6 months ago
- A transformer-based multimodal model for music.☆29Updated last year
- Curated resources for discovering, reading, and working with arXiv papers☆383Updated 7 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis.☆106Updated 9 months ago
- ☆48Updated 11 months ago
- Voice-Enabled Math Tutor Powered by Groq that Calculates and Renders Live Problems and Instruction with LaTeX in Seconds!☆239Updated 3 weeks ago
- support BM25+vector☆29Updated 7 months ago
- ☆74Updated last year
- Control drones with natural language☆165Updated last month
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon.☆225Updated 2 months ago
- ☆211Updated last week
- [EMNLP 2025 Demo] TinyScientist: A Lightweight Framework for Building Research Agents☆126Updated 2 months ago
- Recursive Language Models (RLMs) implementation based on the paper by Zhang, Kraska, and Khattab☆144Updated 2 weeks ago
- Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library☆246Updated this week
- Chrome / Edge extension to turn arXiv papers into Markdown code in one click.☆88Updated 10 months ago
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"☆102Updated 4 months ago
- ☆173Updated 5 months ago