CLAIRE-Labo/quantile-reward-policy-optimization

Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok et al. 2025).

☆30

Alternatives and similar repositories for quantile-reward-policy-optimization

Users who are interested in quantile-reward-policy-optimization are comparing it to the libraries listed below.

CLAIRE-Labo / RAT
Official code for the NeurIPS25 paper "RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling" (https://arxiv.org/abs/25…
☆26 · Dec 10, 2025 · Updated 7 months ago
CLAIRE-Labo / StructuredFFN
The official code of "Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers"
☆20 · Jul 24, 2024 · Updated 2 years ago
jdeschena / sdtt
[ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models
☆52 · Feb 26, 2026 · Updated 4 months ago
manuelmlmadeira / DeFoG
Official implementation of the paper: "DeFoG: Discrete Flow Matching for Graph Generation"
☆69 · Feb 3, 2026 · Updated 5 months ago
ylsung / rsq
Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"
☆23 · Mar 25, 2026 · Updated 4 months ago
rgreenblatt / control-evaluations
☆25 · May 25, 2024 · Updated 2 years ago
zikuicai / aegisllm
☆35 · Feb 17, 2026 · Updated 5 months ago
JasonGross / guarantees-based-mechanistic-interpretability
☆18 · Updated this week
hanqi-qi / LLM_MetaReasoning
☆15 · Jul 29, 2025 · Updated 11 months ago
IBM / activated-lora
Source code for Activated LoRA
☆26 · Jul 15, 2026 · Updated last week
song-wx / SIFT
[ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely
☆24 · Jun 26, 2024 · Updated 2 years ago
jhejna / morphology-opt
Code for the paper Task Agnostic Morphology Evolution.
☆21 · May 25, 2021 · Updated 5 years ago
jplhughes / dotfiles
Easily deploy my zsh and tmux configuration on new machines. Includes local and remote aliases to improve workflow.
☆15 · Apr 23, 2026 · Updated 3 months ago
David-cripto / IDLM
(ICML 2026) IDLM: Inverse-distilled Diffusion Language Models
☆16 · Jun 8, 2026 · Updated last month
DerrickYLJ / LessIsMore
[ICML 2026] Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
☆34 · Sep 12, 2025 · Updated 10 months ago
vpuri3 / FLARE.py
Fast Low-rank Attention Routing Engine
☆17 · Feb 22, 2026 · Updated 5 months ago
Mia-Cong / SWIFT
Official implementation of "Can Test-Time Scaling Improve World Foundation Model?"
☆15 · Jul 12, 2025 · Updated last year
davidstutz / cvpr2019-adversarial-robustness
CVPR 2019 paper "Disentangling Adversarial Robustness and Generalization".
☆14 · Oct 28, 2019 · Updated 6 years ago
amack315 / unsupervised-steering-vectors
☆38 · Apr 30, 2024 · Updated 2 years ago
diffusion-llms / awesome-discrete-diffusion-models
☆17 · Oct 27, 2025 · Updated 8 months ago
kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta
☆13 · Nov 11, 2024 · Updated last year
zhuhanqing / Lightening-Transformer-AE
Artifact evaluation for HPCA'24 paper Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accele…
☆11 · Mar 3, 2024 · Updated 2 years ago
metaskills / llamafile-on-lambda
Serverless AI Inference with Gemma 2 using Mozilla's llamafile on AWS Lambda
☆11 · Jul 30, 2024 · Updated last year
rohitgandikota / erasing-llm
Erasing conceptual knowledge from language models through low-rank fine-tuning
☆23 · Mar 27, 2025 · Updated last year
kyegomez / MobileVLM
Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …
☆15 · Mar 11, 2024 · Updated 2 years ago
fredhohman / summit-notebooks
Notebooks for Scaling Deep Learning Interpretability by Visualizing Activation and Attribution Summarizations
☆15 · Oct 3, 2019 · Updated 6 years ago
tml-epfl / why-weight-decay
Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]
☆73 · Sep 25, 2024 · Updated last year
Sara-mibo / LRP_EncoderDecoder_GRU
Implementing LRP (Layer-wise Relevance Propagation) for a sequence-to-sequence model with GRU layers.
☆12 · Sep 8, 2023 · Updated 2 years ago
Infini-AI-Lab / Kinetics
Kinetics: Rethinking Test-Time Scaling Laws
☆87 · Jul 11, 2025 · Updated last year
7tl7qns7ch / IPOT
Inducing Point Operator Transformer: A Flexible and Scalable Architecture for Solving PDEs (AAAI 2024)
☆14 · Jul 30, 2024 · Updated last year
Infini-AI-Lab / GRESO
☆81 · Jun 8, 2026 · Updated last month
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆83 · Jun 20, 2026 · Updated last month
zqOuO / GWT
☆13 · May 4, 2026 · Updated 2 months ago
Trustworthy-ML-Lab / Linear-Explanations
[ICML 24] A novel automated neuron explanation framework that can accurately describe poly-semantic concepts in deep neural networks
☆14 · May 2, 2025 · Updated last year
jart / matmul
☆18 · Jul 2, 2024 · Updated 2 years ago
cpldcpu / LRMTokenEconomy
Measuring Thinking Efficiency in Reasoning Models - Research Repository
☆39 · Dec 2, 2025 · Updated 7 months ago
bertiev / SimpleSafetyTests
☆19 · Mar 25, 2024 · Updated 2 years ago
U-C4N / Deepseek-CoT
Deepseek-CoT
☆10 · Oct 6, 2024 · Updated last year
chenzen94 / debug-deepspeed-chat
Debug DeepSpeed-Chat step by step in IDE (在IDE里一步一步调试DeepSpeed-Chat)
☆10 · Apr 17, 2023 · Updated 3 years ago