awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆83 Updated 5 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the libraries listed below.
- Fetch arxiv data to LLM-friendly text ☆126 Updated 8 months ago
- A Deep Research agent from scratch ☆212 Updated 6 months ago
- ☆169 Updated last year
- ☆77 Updated 2 months ago
- Completed research on semantic retrieval augmented generation through novel semantic similarity graph traversal algorithms. ☆236 Updated last week
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆213 Updated 5 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python. ☆319 Updated last week
- ☆55 Updated last year
- ☆80 Updated 7 months ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆204 Updated 3 weeks ago
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux ☆307 Updated 4 months ago
- AlphaXIV open-source alternative: Chat with any arXiv paper. ☆93 Updated 5 months ago
- ☆261 Updated 2 weeks ago
- a collection of resources around LLMs, aggregated for the workshop "Mastering LLMs: End-to-End Fine-Tuning and Deployment" by Dan Becker … ☆110 Updated last year
- ☆57 Updated 9 months ago
- A list of useful Open Source tools and scrapers to gather data for LLMs ☆243 Updated 8 months ago
- Open-source autonomous cleaning & housekeeping robot ☆237 Updated 3 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis. ☆100 Updated 7 months ago
- The LLM abstraction layer for modern AI agent applications. ☆376 Updated last week
- An AI agent to control drones from your CLI ☆138 Updated 3 months ago
- A transformer-based multimodal model for music. ☆29 Updated last year
- Countdown Game Distill&RL ☆47 Updated 2 months ago
- Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library ☆236 Updated this week
- Curated resources for discovering, reading, and working with arXiv papers ☆356 Updated 5 months ago
- ☆138 Updated 3 months ago
- ☆49 Updated 9 months ago
- Auto Thinking Mode switch for Qwen3 in Open webui ☆69 Updated 6 months ago
- A simple tool that lets you explore different possible paths that an LLM might sample. ☆192 Updated 6 months ago
- Learning records for building a large language model from scratch ☆58 Updated 10 months ago
- LLM-as-SERP ☆70 Updated 8 months ago