awjuliani / web-rl-playground
An interactive web-based demonstration of fundamental tabular Reinforcement Learning (RL) algorithms in a simple grid world environment.
☆76 Updated 3 months ago
Alternatives and similar repositories for web-rl-playground
Users who are interested in web-rl-playground are comparing it to the repositories listed below.
- Fetch arxiv data to LLM-friendly text ☆125 Updated 6 months ago
- A Deep Research agent from scratch ☆207 Updated 4 months ago
- RLHF (Supervised fine-tuning, reward model, and PPO) step-by-step in 3 Jupyter notebooks ☆200 Updated 2 months ago
- A fully functional and simple Machine Learning library made entirely from scratch with Python. ☆297 Updated last month
- ☆168 Updated last year
- Learning records for building a large language model from scratch ☆57 Updated 8 months ago
- ☆131 Updated last month
- ☆73 Updated 3 weeks ago
- 📰 Building News Agents to Summarize News with MCP, Q, and tmux ☆300 Updated 2 months ago
- Curated resources for discovering, reading, and working with arXiv papers ☆340 Updated 3 months ago
- A list of useful Open Source tools and scrapers to gather data for LLMs ☆239 Updated 6 months ago
- A transformer-based multimodal model for music. ☆29 Updated last year
- AlphaXIV open-source alternative: Chat with any arXiv paper. ☆78 Updated 3 months ago
- ☆256 Updated last month
- Turn topics into essays in seconds! ☆187 Updated 2 months ago
- ☆55 Updated 10 months ago
- An AI agent to control drones from your CLI ☆130 Updated last month
- Countdown Game Distill&RL ☆47 Updated 2 weeks ago
- ☆77 Updated 5 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆33 Updated 4 months ago
- Using APPL to reimplement popular algorithms for Large Language Models (LLMs) and prompts ☆45 Updated 8 months ago
- Chrome / Edge extension to turn arXiv papers into Markdown codes in one click. ☆82 Updated 6 months ago
- Train a Language Model with GRPO to create a schedule from a list of events and priorities ☆231 Updated 4 months ago
- Turn local files into a prompt for an LLM ☆176 Updated 8 months ago
- DeepSearch Code-Actions Agent (DSCA). Build 🙌 with 🤗 smolagents ☆117 Updated last month
- Python Implementation of MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings) ☆308 Updated 2 months ago
- https://no-ocr.com/about ☆164 Updated 2 months ago
- ☆48 Updated 7 months ago
- Repo housing the open sourced code for the ai2 scholar qa app and also the corresponding library ☆226 Updated 2 weeks ago
- Convert Everything to PDF ☆164 Updated 4 months ago