UW-Madison-Lee-Lab / VersaPRM
☆20 Updated 2 months ago
Alternatives and similar repositories for VersaPRM:
Users who are interested in VersaPRM are comparing it to the repositories listed below.
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆74 Updated 10 months ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆49 Updated 11 months ago
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24) ☆57 Updated last year
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆62 Updated 11 months ago
- ☆44 Updated last month
- Official Repo for Paper "Optimizing Temperature for Language Models with Multi-Sample Inference" ☆16 Updated 2 months ago
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆29 Updated 7 months ago
- The official implementation of Self-Exploring Language Models (SELM) ☆63 Updated 10 months ago
- Directional Preference Alignment ☆57 Updated 7 months ago
- Code repository for the paper - "Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass" ☆17 Updated 8 months ago
- ☆17 Updated 3 months ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes" ☆27 Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆53 Updated 3 weeks ago
- Codebase for Instruction Following without Instruction Tuning ☆34 Updated 7 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint" ☆38 Updated last year
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆68 Updated last month
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆34 Updated last month
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆99 Updated last year
- PyTorch implementation of StableMask (ICML'24) ☆12 Updated 9 months ago
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation" ☆35 Updated 9 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆43 Updated last week
- Repository for the paper: 500xCompressor: Generalized Prompt Compression for Large Language Models ☆33 Updated 8 months ago
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens ☆24 Updated last year
- ☆18 Updated 11 months ago
- [EMNLP 2023, Findings] GRACE: Discriminator-Guided Chain-of-Thought Reasoning ☆47 Updated 6 months ago
- ☆14 Updated 4 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆68 Updated 5 months ago
- Long Context Extension and Generalization in LLMs ☆53 Updated 7 months ago