awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆71 Updated 2 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the repositories listed below.
- Fetch arxiv data to LLM-friendly text ☆123 Updated 5 months ago
- A Deep Research agent from scratch ☆201 Updated 2 months ago
- ☆166 Updated last year
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆183 Updated last month
- Learning records for building a large language model from scratch ☆54 Updated 7 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python. ☆295 Updated this week
- A transformer-based multimodal model for music. ☆29 Updated 11 months ago
- ☆77 Updated 3 months ago
- Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library ☆206 Updated last week
- Chrome / Edge extension to turn arXiv papers into Markdown codes in one click. ☆79 Updated 4 months ago
- ☆54 Updated 8 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis. ☆91 Updated 3 months ago
- A simple tool that lets you explore different possible paths that an LLM might sample. ☆180 Updated 3 months ago
- Countdown Game Distill&RL ☆46 Updated 3 months ago
- ☆206 Updated 6 months ago
- CodeScientist: An automated scientific discovery system for code-based experiments ☆288 Updated last month
- ☆48 Updated 6 months ago
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux ☆293 Updated 2 weeks ago
- ☆57 Updated 5 months ago
- LLM-as-SERP ☆68 Updated 5 months ago
- A list of useful Open Source tools and scrapers to gather data for LLMs ☆237 Updated 5 months ago
- Commit0: Library Generation from Scratch ☆161 Updated 3 months ago
- The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆86 Updated 2 weeks ago
- Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings) ☆278 Updated last month
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆34 Updated 3 months ago
- An AI agent to control drones from your CLI ☆122 Updated last week
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆94 Updated last week
- https://no-ocr.com/about ☆163 Updated last month
- Support BM25+vector ☆29 Updated 2 months ago
- Using APPL to reimplement popular algorithms for Large Language Models (LLMs) and prompts ☆45 Updated 6 months ago