[ICLR 2026 Blogpost Track Poster] JustRL: Scaling a 1.5B LLM with a Simple RL Recipe
☆258 · Mar 11, 2026 · Updated last month
Alternatives and similar repositories for JustRL
Users that are interested in JustRL are comparing it to the libraries listed below.
- [NeurIPS 2025] "Reasoning Models Better Express Their Confidence" ☆23 · Nov 19, 2025 · Updated 4 months ago
- Source code for the ACL'2025 paper titled "Unveiling privacy risks in llm agent memory" ☆28 · Dec 2, 2025 · Updated 4 months ago
- Github Repository for the HOI4 ULTRA Project. ☆11 · Updated this week
- [CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice ☆81 · Feb 27, 2026 · Updated last month
- A PyTorch native library for large model training ☆25 · Apr 1, 2026 · Updated last week
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State ☆11 · Sep 21, 2024 · Updated last year
- [NDSS 2026] Official repo for Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography ☆32 · Mar 14, 2026 · Updated 3 weeks ago
- ☆46 · Jun 24, 2025 · Updated 9 months ago
- Code for paper Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks ☆13 · Aug 9, 2022 · Updated 3 years ago
- C++ implementation for 《"GrabCut" - Interactive Foreground Extraction using Iterated Graph Cuts》 ☆12 · Jul 25, 2023 · Updated 2 years ago
- ☆32 · Oct 22, 2025 · Updated 5 months ago
- Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL." ☆89 · Feb 25, 2026 · Updated last month
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- ☆11 · Feb 24, 2025 · Updated last year
- ☆11 · Mar 11, 2025 · Updated last year
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆27 · Oct 23, 2025 · Updated 5 months ago
- [ICLR2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models" ☆30 · Feb 4, 2026 · Updated 2 months ago
- Combining SOAP and MUON ☆19 · Feb 11, 2025 · Updated last year
- PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs" ☆13 · Mar 11, 2026 · Updated last month
- PeRL: Parameter-Efficient Reinforcement Learning ☆74 · Updated this week
- GeoZarr extension for OpenLayers ☆12 · Jun 27, 2024 · Updated last year
- RWKV-7 mini ☆12 · Mar 29, 2025 · Updated last year
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning ☆24 · Oct 14, 2025 · Updated 5 months ago
- A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha… ☆19 · Jan 11, 2025 · Updated last year
- The official implementation of our ECCV 2024 publication, PYRA (Parallel Yielding Re-Activation). ☆22 · Dec 19, 2025 · Updated 3 months ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation" ☆38 · Oct 11, 2024 · Updated last year
- Qualifying Exam Preparing ☆17 · May 7, 2025 · Updated 11 months ago
- SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro… ☆11 · Jul 9, 2025 · Updated 9 months ago
- ☆24 · Dec 11, 2024 · Updated last year
- ☆44 · Oct 12, 2025 · Updated 6 months ago
- THU Mathematics for Engineering Master Candidates (Tsinghua University mathematics course for engineering master's students). ☆11 · Nov 21, 2021 · Updated 4 years ago
- ☆23 · Mar 18, 2024 · Updated 2 years ago
- The official implementation of the paper "Data Contamination Calibration for Black-box LLMs" (ACL 2024) ☆16 · May 21, 2024 · Updated last year
- ☆26 · Mar 30, 2026 · Updated 2 weeks ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆67 · Dec 10, 2024 · Updated last year
- Simple repository for training small reasoning models ☆49 · Feb 17, 2026 · Updated last month
- Website for HKU NLP group (under construction) ☆14 · Mar 20, 2026 · Updated 3 weeks ago
- [WACV 2026] SceneEdited: A City-Scale Benchmark for 3D HD Map Updating via Image-Guided Change Detection ☆16 · Mar 22, 2026 · Updated 3 weeks ago
- Make reasoning models scalable ☆49 · May 31, 2025 · Updated 10 months ago