[NeurIPS Spotlight 2025] Angles Don’t Lie: Unlocking Training-Efficient RL Through the Model’s Own Signals.
☆81 Sep 26, 2025 Updated 5 months ago
Alternatives and similar repositories for GAINRL
Users who are interested in GAINRL are comparing it to the libraries listed below.
- [ICLR 2025] "Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives" ☆50 Oct 19, 2025 Updated 4 months ago
- [ICLR2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models" ☆30 Feb 4, 2026 Updated last month
- ☆33 Feb 25, 2026 Updated last week
- [ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely ☆24 Jun 26, 2024 Updated last year
- DeiT implementation for Q-ViT ☆25 Apr 21, 2025 Updated 10 months ago
- Code for the paper Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation (CVPR 2023). ☆34 May 26, 2023 Updated 2 years ago
- Build an AI bot in Discord to serve users personalized reports on what's up in tech ☆28 Sep 14, 2025 Updated 5 months ago
- Repository of IPBench ☆19 Jan 4, 2026 Updated 2 months ago
- ☆11 Sep 27, 2022 Updated 3 years ago
- ☆34 Mar 12, 2025 Updated 11 months ago
- MemRec ☆37 Jan 16, 2026 Updated last month
- ☆11 Mar 31, 2022 Updated 3 years ago
- [NeurIPS'23] Binary Classification with Confidence Difference ☆10 May 13, 2024 Updated last year
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆23 Oct 23, 2025 Updated 4 months ago
- PyData Boston 2013 talks: "Intro to scikit-learn" & "Realtime Predictive Analytics: Using scikit-learn and RabbitMQ" ☆11 Jan 5, 2014 Updated 12 years ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents ☆21 Jan 6, 2026 Updated 2 months ago
- EduAction is an educational content generation application powered by GenAI developed during the Encode Club AI Hackathon London 2024. ☆12 Mar 24, 2024 Updated last year
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆81 Dec 25, 2025 Updated 2 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 Aug 28, 2025 Updated 6 months ago
- ☆11 Jul 17, 2023 Updated 2 years ago
- Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning ☆25 Jan 5, 2026 Updated 2 months ago
- ☆11 Apr 28, 2024 Updated last year
- GBM implementation on Legate ☆14 Jan 28, 2026 Updated last month
- The source code and the data for ACL 2022 paper "Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Dat… ☆14 Apr 21, 2023 Updated 2 years ago
- ☆32 Sep 19, 2025 Updated 5 months ago
- 🎉 TrustJudge is accepted to ICLR 2026! ☆38 Sep 27, 2025 Updated 5 months ago
- ☆12 May 23, 2024 Updated last year
- opentqa is an open framework for textbook question answering, which includes xtqa, mcan, cmr, mfb, and mutan. ☆11 Mar 27, 2021 Updated 4 years ago
- Code for our paper: "Building A Coding Assistant via Retrieval-Augmented Language Models" ☆10 Nov 2, 2024 Updated last year
- ☆15 Mar 13, 2025 Updated 11 months ago
- Faster version of AugShuffleNet without channel shuffle, computes partially, crossovers swiftly ☆11 Feb 17, 2025 Updated last year
- Generating Summaries with Controllable Readability Levels (EMNLP 2023) ☆15 Aug 6, 2025 Updated 7 months ago
- Code for the paper "FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024" ☆13 Feb 14, 2025 Updated last year
- Serial monitor in Rust ☆14 Jul 24, 2024 Updated last year
- Transformer + GAT for RNA chemical reactivity prediction | Stanford Ribonanza ☆11 Jan 28, 2026 Updated last month
- Official implementation for Text Generation Beyond Discrete Token Sampling ☆22 Aug 11, 2025 Updated 6 months ago
- 🦓 🔗 ZebraChain is a logged, quantum safe signing protocol designed to replace the long lived asymmetric key pairs used to sign software… ☆13 Dec 28, 2025 Updated 2 months ago
- ☆10 Feb 24, 2025 Updated last year
- Resolves AWS secretmanager secrets from variables that give the secret ARNs and exposes them as plain environment variables ☆13 Aug 13, 2024 Updated last year