Official Code Release for "Training a Generally Curious Agent"
☆47May 18, 2025Updated last year
Alternatives and similar repositories for paprika
Users that are interested in paprika are comparing it to the libraries listed below.
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆32Apr 20, 2024Updated 2 years ago
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators☆59Dec 23, 2025Updated 5 months ago
- ☆16Feb 22, 2025Updated last year
- Official repo for Offline RL for Online RL☆18Oct 14, 2023Updated 2 years ago
- Measuring General Intelligence With Generated Games (Preprint)☆25Jul 30, 2025Updated 10 months ago
- Source code for SWIFT, an efficient reward model.☆21Jan 13, 2026Updated 4 months ago
- ReCross: Unsupervised Cross-Task Generalization via Retrieval Augmentation☆23May 1, 2022Updated 4 years ago
- Martingale posterior neural networks for fast sequential decision making @ Neurips 2025☆25Nov 13, 2025Updated 6 months ago
- A Claude Code hook plugin for IP-based access control · Prevent Claude account bans · Claude IP detection · IP geolocation blocking · Claude account protection☆105Apr 1, 2026Updated last month
- Official Implementation of UTrice: Unifying Primitives in Differentiable Ray Tracing and Rasterization via Triangles for Particle-Based 3…☆30Jan 13, 2026Updated 4 months ago
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries".☆12Jan 9, 2025Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 6 months ago
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning"☆20Oct 6, 2021Updated 4 years ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability☆14Mar 11, 2025Updated last year
- CLARA: Confidence of Labels and Raters☆10Jun 3, 2023Updated 2 years ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training☆54Dec 13, 2025Updated 5 months ago
- IPHYRE: Interactive Physical Reasoning, ICLR 2024☆18Oct 18, 2024Updated last year
- ☆11Oct 2, 2023Updated 2 years ago
- ☆10Nov 23, 2020Updated 5 years ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆42Apr 4, 2025Updated last year
- [ICLR 2026] SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs☆59May 20, 2026Updated last week
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"☆165Mar 2, 2026Updated 2 months ago
- ☆28Dec 1, 2021Updated 4 years ago
- Prompt-R1: Collaborative Automatic Prompting Framework via End-to-end Reinforcement Learning☆60Feb 24, 2026Updated 3 months ago
- [NeurIPS 2025] First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training☆86Oct 29, 2025Updated 7 months ago
- Short RL☆18Apr 16, 2026Updated last month
- Code for Representation Bending Paper☆17Jul 15, 2025Updated 10 months ago
- Analyzing LLM Alignment via Token distribution shift☆18Jan 26, 2024Updated 2 years ago
- Code for the main RoboTutor app. Many sound and image assets not included.☆14Nov 5, 2019Updated 6 years ago
- This repository contains the code and pre-trained models for our paper☆24Jun 29, 2025Updated 11 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆268May 5, 2025Updated last year
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …☆26May 16, 2026Updated last week
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"☆21Mar 25, 2026Updated 2 months ago
- ☆17Nov 7, 2024Updated last year
- A framework for hosting and scaling AI agents.☆42Nov 25, 2024Updated last year
- Enemies for your LLM☆36Jan 20, 2026Updated 4 months ago
- This is the repository that introduces research topics related to protecting intellectual property (IP) of AI from a data-centric perspec…☆23Oct 30, 2023Updated 2 years ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control☆69Jan 7, 2026Updated 4 months ago
- ☆15Apr 26, 2025Updated last year