awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆90 Updated 6 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the libraries listed below.
- Fetch arxiv data to LLM-friendly text ☆128 Updated 3 weeks ago
- ☆168 Updated last year
- Curated resources for discovering, reading, and working with arXiv papers ☆378 Updated 6 months ago
- A Deep Research agent from scratch ☆214 Updated 7 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆225 Updated 6 months ago
- ☆79 Updated 4 months ago
- AlphaXIV open-source alternative: Chat with any arXiv paper. ☆105 Updated 7 months ago
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux ☆307 Updated 5 months ago
- ☆245 Updated last month
- An AI-powered interface for exploring and understanding arXiv research papers ☆239 Updated last week
- Mission intent compiler and autonomy supervisor for unmanned systems. ☆144 Updated 2 weeks ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis. ☆103 Updated 8 months ago
- A transformer-based multimodal model for music. ☆29 Updated last year
- ☆55 Updated last year
- ☆80 Updated 8 months ago
- A simple tool that lets you explore different possible paths that an LLM might sample. ☆197 Updated 7 months ago
- ☆91 Updated 2 months ago
- Learning records for building a large language model from scratch ☆58 Updated last year
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies ☆165 Updated 2 months ago
- ☆49 Updated 10 months ago
- ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations [COLM 2025] ☆246 Updated 5 months ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆222 Updated 2 months ago
- a collection of resources around LLMs, aggregated for the workshop "Mastering LLMs: End-to-End Fine-Tuning and Deployment" by Dan Becker … ☆110 Updated last year
- ☆57 Updated 10 months ago
- A list of useful Open Source tools and scrapers to gather data for LLMs ☆245 Updated 10 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" ☆63 Updated 8 months ago
- Model Activity Visualiser ☆519 Updated 8 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python. ☆325 Updated last month
- Completed research on semantic retrieval augmented generation through novel semantic similarity graph traversal algorithms. ☆265 Updated last month
- The LLM abstraction layer for modern AI agent applications. ☆500 Updated last week