Official repository for Beyond Binary Rewards: Training LMs to Reason about Their Uncertainty
☆54 · Aug 20, 2025 · Updated 6 months ago
Alternatives and similar repositories for RLCR
Users who are interested in RLCR are comparing it to the repositories listed below
- Code for ACL 2023 paper "BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases". ☆22 · Sep 7, 2023 · Updated 2 years ago
- ☆28 · Sep 13, 2021 · Updated 4 years ago
- Official repository of "Back to Source: Diffusion-Driven Test-Time Adaptation" ☆87 · Nov 27, 2023 · Updated 2 years ago
- Introduction to Machine Learning using scikit-learn and PyTorch ☆10 · Sep 26, 2019 · Updated 6 years ago
- ☆11 · Jun 18, 2023 · Updated 2 years ago
- Code for the paper "Semi-Conditional Normalizing Flows for Semi-Supervised Learning" ☆11 · Mar 30, 2020 · Updated 5 years ago
- Source code to accompany research paper on training multi token prediction language models using self-distillation. ☆24 · Feb 21, 2026 · Updated 2 weeks ago
- Test-Time Adaptation via Conjugate Pseudo-Labels ☆42 · May 25, 2023 · Updated 2 years ago
- UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs ☆11 · Apr 13, 2023 · Updated 2 years ago
- ☆13 · Feb 14, 2022 · Updated 4 years ago
- Empowering Small VLMs to Think with Dynamic Memorization and Exploration ☆15 · Nov 18, 2025 · Updated 3 months ago
- Toolkit in Python for the acquisition, analysis and visualization of motion capture using IMU ☆14 · May 19, 2021 · Updated 4 years ago
- A list of all papers related to anomaly detection in NeurIPS 2020. ☆10 · Jan 13, 2021 · Updated 5 years ago
- Code accompanying our ICML 2020 paper on choice set optimization in group decision-making. ☆11 · Jun 27, 2020 · Updated 5 years ago
- Bayesian scaling laws for in-context learning. ☆15 · Mar 12, 2025 · Updated 11 months ago
- ☆10 · Jan 28, 2024 · Updated 2 years ago
- Prompt-Guided Retrieval For Non-Knowledge-Intensive Tasks ☆12 · Sep 1, 2023 · Updated 2 years ago
- NOMU: Neural Optimization-based Model Uncertainty ☆10 · Feb 17, 2023 · Updated 3 years ago
- ☆12 · Sep 24, 2024 · Updated last year
- Active Learning Helps Pretrained Models Learn the Intended Task (https://arxiv.org/abs/2204.08491) by Alex Tamkin, Dat Nguyen, Salil Desh… ☆11 · Nov 22, 2022 · Updated 3 years ago
- ☆13 · Jul 14, 2025 · Updated 7 months ago
- [Re] Can gradient clipping mitigate label noise? (ML Reproducibility Challenge 2020) ☆14 · Sep 3, 2024 · Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models" ☆23 · Mar 4, 2025 · Updated last year
- ☆11 · Jul 1, 2021 · Updated 4 years ago
- Code release for Unsupervised Domain Adaptation via Distilled Discriminative Clustering published by Pattern Recognition in 2022 ☆11 · May 19, 2023 · Updated 2 years ago
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding. ☆13 · Nov 19, 2024 · Updated last year
- ☆15 · Mar 2, 2025 · Updated last year
- Code for https://arxiv.org/abs/1811.00145 ☆12 · Feb 13, 2021 · Updated 5 years ago
- Repository for the NeurIPS 2023 paper "Beyond Confidence: Reliable Models Should Also Consider Atypicality" ☆13 · Apr 21, 2024 · Updated last year
- [ICLR 2022] "Bayesian Modeling and Uncertainty Quantification for Learning to Optimize: What, Why, and How" by Yuning You, Yue Cao, Tianl… ☆14 · Aug 19, 2022 · Updated 3 years ago
- implementation of Wasserstein Natural Policy Gradients and Wasserstein Natural Evolution Strategies ☆13 · Mar 9, 2021 · Updated 4 years ago
- Code for the paper "Local Causal Discovery for Estimating Causal Effects". ☆11 · Apr 9, 2024 · Updated last year
- ☆11 · Jan 21, 2021 · Updated 5 years ago
- Active learning in NLP ☆14 · Dec 14, 2022 · Updated 3 years ago
- This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data ☆13 · Jul 21, 2024 · Updated last year
- Code Release for Task Agnostic Dynamics Priors for Deep Reinforcement Learning ☆12 · Jun 13, 2019 · Updated 6 years ago
- Code for Generalization Guarantees for (Multi-Modal) Imitation Learning ☆11 · Jul 14, 2022 · Updated 3 years ago
- Simple PyTorch profiler that combines DeepSpeed Flops Profiler and TorchInfo ☆11 · Feb 12, 2023 · Updated 3 years ago
- A Google Colab for DFDNet: Blind Face Restoration ☆12 · Aug 9, 2021 · Updated 4 years ago