awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆89 Updated 6 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the repositories listed below.
- Fetch arxiv data to LLM-friendly text ☆128 Updated last week
- A Deep Research agent from scratch ☆212 Updated 6 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python. ☆323 Updated 3 weeks ago
- ☆168 Updated last year
- ☆55 Updated last year
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux ☆307 Updated 4 months ago
- ☆262 Updated last month
- Curated resources for discovering, reading, and working with arXiv papers ☆357 Updated 6 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆221 Updated 5 months ago
- AlphaXIV open-source alternative: Chat with any arXiv paper. ☆104 Updated 6 months ago
- A transformer-based multimodal model for music. ☆29 Updated last year
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆211 Updated last month
- ☆57 Updated 10 months ago
- ☆80 Updated 7 months ago
- Countdown Game Distill&RL ☆47 Updated 3 months ago
- The LLM abstraction layer for modern AI agent applications. ☆484 Updated this week
- Learning records for building a large language model from scratch ☆58 Updated 11 months ago
- ☆79 Updated 3 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 Updated 7 months ago
- Challenges for general-purpose web-browsing AI agents ☆67 Updated 6 months ago
- An AI agent to control drones from your CLI ☆141 Updated 4 months ago
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆100 Updated 3 months ago
- A list of useful Open Source tools and scrapers to gather data for LLMs ☆244 Updated 9 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis. ☆100 Updated 7 months ago
- a collection of resources around LLMs, aggregated for the workshop "Mastering LLMs: End-to-End Fine-Tuning and Deployment" by Dan Becker … ☆110 Updated last year
- Completed research on semantic retrieval augmented generation through novel semantic similarity graph traversal algorithms. ☆255 Updated last month
- ☆49 Updated 10 months ago
- Model Activity Visualiser ☆519 Updated 8 months ago
- Wanna breeze through some papers? ☆65 Updated last month
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" ☆62 Updated 8 months ago