Jiuzhouh / Uncertainty-Aware-Language-Agent
This is the official repo for Towards Uncertainty-Aware Language Agent.
☆30 · Updated last year
Alternatives and similar repositories for Uncertainty-Aware-Language-Agent
Users interested in Uncertainty-Aware-Language-Agent are comparing it to the repositories listed below.
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆51 · Updated last year
- ☆46 · Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆124 · Updated last year
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆63 · Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆44 · Updated 7 months ago
- ☆103 · Updated last year
- Restore safety in fine-tuned language models through task arithmetic ☆29 · Updated last year
- This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity ☆47 · Updated last year
- GenRM-CoT: Data release for verification rationales ☆66 · Updated last year
- ☆29 · Updated last year
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆37 · Updated last year
- ☆21 · Updated 3 months ago
- A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu… ☆81 · Updated last month
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆31 · Updated 10 months ago
- Directional Preference Alignment ☆58 · Updated last year
- Code for "Reasoning to Learn from Latent Thoughts" ☆122 · Updated 8 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Updated last year
- Self-Supervised Alignment with Mutual Information ☆21 · Updated last year
- ☆57 · Updated 6 months ago
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024) ☆70 · Updated last year
- Code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs" ☆137 · Updated last year
- ☆102 · Updated 2 years ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆82 · Updated 11 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning ☆23 · Updated last month
- Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆69 · Updated 2 years ago
- Lightweight Adapting for Black-Box Large Language Models ☆24 · Updated last year
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆86 · Updated 6 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆123 · Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆63 · Updated last year
- ☆68 · Updated last year