☆14 · Dec 16, 2024 · Updated last year
Alternatives and similar repositories for MRPO
Users that are interested in MRPO are comparing it to the libraries listed below
- Episodic Policy Gradient Training ☆17 · Mar 1, 2022 · Updated 4 years ago
- Source code for Stable Hadamard Memory ☆24 · May 6, 2025 · Updated 10 months ago
- Implementation of "Decoding-time Realignment of Language Models", ICML 2024. ☆21 · Jun 17, 2024 · Updated last year
- Code for the paper "Spectral Editing of Activations for Large Language Model Alignments" ☆29 · Dec 20, 2024 · Updated last year
- The official code release for Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization ☆34 · Mar 9, 2025 · Updated last year
- ☆38 · Apr 8, 2023 · Updated 2 years ago
- ☆46 · Apr 10, 2023 · Updated 2 years ago
- ☆39 · Aug 9, 2022 · Updated 3 years ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆53 · Jun 13, 2025 · Updated 8 months ago
- Test-time-training on nearest neighbors for large language models ☆49 · Apr 18, 2024 · Updated last year
- Code for our paper: "GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models" ☆57 · Apr 23, 2023 · Updated 2 years ago
- Rewarded soups official implementation ☆62 · Sep 27, 2023 · Updated 2 years ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆66 · Oct 18, 2024 · Updated last year
- Algebraic value editing in pretrained language models ☆69 · Nov 1, 2023 · Updated 2 years ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models ☆66 · Dec 10, 2024 · Updated last year
- Released code for our ICLR23 paper. ☆66 · Mar 23, 2023 · Updated 2 years ago
- ☆74 · Apr 13, 2025 · Updated 10 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆79 · Jun 10, 2025 · Updated 9 months ago
- Official code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le… ☆75 · Mar 20, 2024 · Updated last year
- ☆78 · Oct 5, 2023 · Updated 2 years ago
- We view Large Language Models as stochastic language layers in a network, where the learnable parameters are the natural language prompts… ☆95 · Jul 25, 2024 · Updated last year
- PyTorch implementation of "ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data" (AAAI 2025 [o… ☆148 · Jul 15, 2025 · Updated 7 months ago
- A resource repository for representation engineering in large language models ☆148 · Nov 14, 2024 · Updated last year
- ☆208 · Dec 20, 2024 · Updated last year
- Code for the paper "Efficient Training of Language Models to Fill in the Middle" ☆200 · Apr 2, 2023 · Updated 2 years ago
- End-To-End Task-Completion Dialogue Challenge ☆194 · Jun 20, 2019 · Updated 6 years ago
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆211 · Jul 31, 2023 · Updated 2 years ago
- Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models" ☆258 · Jul 22, 2025 · Updated 7 months ago
- Multi-Objective Reinforcement Learning ☆296 · Aug 10, 2021 · Updated 4 years ago
- ☆356 · May 17, 2024 · Updated last year
- Few-shot Learning of GPT-3 ☆357 · Sep 18, 2023 · Updated 2 years ago
- Code and data for "Lost in the Middle: How Language Models Use Long Contexts" ☆374 · Jan 4, 2024 · Updated 2 years ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆417 · Apr 13, 2025 · Updated 10 months ago
- MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models ☆454 · Feb 1, 2024 · Updated 2 years ago
- RewardBench: the first evaluation tool for reward models. ☆702 · Feb 16, 2026 · Updated 3 weeks ago
- Summary of deep learning models for dialog systems (Tiancheng Zhao LTI, CMU) ☆643 · Jul 8, 2020 · Updated 5 years ago
- JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces. ☆753 · Oct 26, 2022 · Updated 3 years ago
- [ICLR 2025 Spotlight] Official implementation of "Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts" ☆916 · Dec 12, 2025 · Updated 2 months ago
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆772 · Apr 7, 2023 · Updated 2 years ago