Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.
☆67 · May 5, 2025 · Updated 10 months ago
Alternatives and similar repositories for calculator_agent_rl
Users who are interested in calculator_agent_rl are comparing it to the libraries listed below.
- A Multi-Agentic AI Assistant/Builder ☆25 · Jan 23, 2026 · Updated last month
- coded with and corrected by Google Anti-Gravity ☆13 · Nov 23, 2025 · Updated 3 months ago
- This project showcases engaging interactions between two AI chatbots. ☆10 · Jan 10, 2024 · Updated 2 years ago
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols ☆13 · Aug 13, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- A toy text-to-image model trained from scratch. ☆19 · Jun 9, 2025 · Updated 8 months ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- ☆43 · Jan 27, 2026 · Updated last month
- Quick Notebook Tutorials ☆36 · Jul 17, 2025 · Updated 7 months ago
- My learnings (publicly) on RAG systems ☆14 · Jan 2, 2024 · Updated 2 years ago
- A semantic caching layer for LLM apps. It’s meant to cut down on repeated API calls even when the user phrases things differently ☆14 · Jul 3, 2025 · Updated 8 months ago
- Exploring Applications of GRPO ☆252 · Aug 25, 2025 · Updated 6 months ago
- Our library for RL environments + evals ☆3,877 · Updated this week
- [SDM24] Official code for "Time-Transformer" ☆18 · Sep 30, 2025 · Updated 5 months ago
- Efficient computer use agent powered by Meta Llama 4 Maverick ☆46 · Apr 17, 2025 · Updated 10 months ago
- Mixtral finetuning ☆19 · Feb 2, 2024 · Updated 2 years ago
- Let's create synthetic textbooks together :) ☆76 · Jan 29, 2024 · Updated 2 years ago
- Exploring limitations of LLM-as-a-judge ☆20 · Aug 17, 2024 · Updated last year
- ☆20 · Mar 25, 2025 · Updated 11 months ago
- Local Ollama with Qdrant RAG: Embed, index, and enhance models for retrieval-augmented generation. Get started with easy setup for powerf… ☆25 · Mar 27, 2024 · Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆180 · May 2, 2024 · Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 · Jan 5, 2026 · Updated 2 months ago
- qwen3 experiments ☆34 · Jul 1, 2025 · Updated 8 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆48 · Sep 26, 2024 · Updated last year
- ☆160 · Apr 17, 2025 · Updated 10 months ago
- Miscellaneous Tutorials ☆26 · Sep 20, 2023 · Updated 2 years ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 · Oct 18, 2025 · Updated 4 months ago
- Simple GRPO scripts and configurations. ☆59 · Feb 6, 2025 · Updated last year
- Framework-Agnostic RL Environments for LLM Fine-Tuning ☆44 · Feb 28, 2026 · Updated last week
- One Line To Build Zero-Data Classifiers in Minutes ☆64 · Sep 25, 2024 · Updated last year
- The State Of The Art, intelligence ☆157 · Aug 12, 2025 · Updated 6 months ago
- A repository containing general tutorials I'd like to share with the world. ☆81 · Dec 1, 2025 · Updated 3 months ago
- Yaraa (Yet Another Rag Automation Attempt) is a library that tackles the boring aspects of managing Rag pipelines, so you don't have to. ☆26 · Sep 5, 2024 · Updated last year
- Locally hosted AI Agent Python Tool To Generate Novel Research Hypothesis + Titles + Abstracts ☆30 · Apr 30, 2025 · Updated 10 months ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 · Jun 28, 2023 · Updated 2 years ago
- Easily create LLM automation/agent workflows ☆60 · Feb 13, 2024 · Updated 2 years ago
- MCP-enabled AI conversation engine with MCTS analysis, FastAPI backend, and async operations for building advanced LLM applications ☆47 · Jul 27, 2025 · Updated 7 months ago
- A Streamlit app for generating high-quality Q&A training datasets from text and PDFs, leveraging Gemini, Claude, and OpenAI for LLM fine-… ☆39 · Jul 5, 2025 · Updated 8 months ago
- Simple and fast server for GPTQ-quantized LLaMA inference ☆24 · May 18, 2023 · Updated 2 years ago