Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok et al. 2025).
☆30 · Dec 8, 2025 · Updated 2 months ago
Alternatives and similar repositories for quantile-reward-policy-optimization
Users who are interested in quantile-reward-policy-optimization compare it to the repositories listed below.
- ☆20 · May 25, 2024 · Updated last year
- ☆17 · Updated this week
- ☆13 · Oct 5, 2025 · Updated 4 months ago
- Attribution-based Parameter Decomposition · ☆34 · Jun 11, 2025 · Updated 8 months ago
- Build an AI bot in Discord to serve user's personalized reports on what's up in tech · ☆28 · Sep 14, 2025 · Updated 5 months ago
- my profile readme · ☆14 · Updated this week
- ☆16 · Feb 22, 2025 · Updated last year
- PyTorch Quantization Framework For OCP MX Datatypes. · ☆16 · May 30, 2025 · Updated 9 months ago
- ☆12 · Jul 8, 2024 · Updated last year
- Reference implementation of Thin and Deep Gaussian Processes (NeurIPS 2023) · ☆14 · Nov 25, 2024 · Updated last year
- Implementing LRP (Layer-wise Relevance Propagation) for a sequence-to-sequence model with GRU layers. · ☆12 · Sep 8, 2023 · Updated 2 years ago
- ☆12 · Jan 10, 2023 · Updated 3 years ago
- Integrating neurosymbolic representations into LLMs for interpretability, steering, and running symbolic algorithms · ☆14 · Feb 2, 2026 · Updated last month
- Generate Quiz Question from PDF/Text files · ☆11 · Feb 2, 2024 · Updated 2 years ago
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch · ☆10 · Aug 7, 2024 · Updated last year
- Code for experiments on self-prediction as a way to measure introspection in LLMs · ☆16 · Dec 10, 2024 · Updated last year
- Debug DeepSpeed-Chat step by step in an IDE · ☆10 · Apr 17, 2023 · Updated 2 years ago
- All-in-One Safety Evaluation Framework · ☆41 · Feb 13, 2026 · Updated 2 weeks ago
- 🚀 Sliding Window Attention Training for Efficient Large Language Models · ☆16 · Dec 8, 2025 · Updated 2 months ago
- ☆14 · Aug 5, 2022 · Updated 3 years ago
- Learning to Skip the Middle Layers of Transformers · ☆17 · Aug 7, 2025 · Updated 6 months ago
- ☆14 · Aug 7, 2024 · Updated last year
- Code related to the paper "Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation" · ☆13 · May 8, 2019 · Updated 6 years ago
- Langchain + Docker + Neo4j · ☆10 · Oct 29, 2024 · Updated last year
- The course work repo for UoSurrey EEEM071 (2023 Spring) · ☆11 · May 9, 2023 · Updated 2 years ago
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that go beyond RoPE in supporting n-dimensional inputs. · ☆14 · Aug 8, 2025 · Updated 6 months ago
- Deepseek-CoT · ☆10 · Oct 6, 2024 · Updated last year
- DeepSAVA: Sparse Adversarial Video Attacks with Spatial Transformations - BMVC 2021 & Neural Networks (2023) · ☆11 · Dec 13, 2024 · Updated last year
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc... · ☆14 · May 28, 2025 · Updated 9 months ago
- Serverless AI Inference with Gemma 2 using Mozilla's llamafile on AWS Lambda · ☆11 · Jul 30, 2024 · Updated last year
- ☆11 · Mar 13, 2023 · Updated 2 years ago
- Python package for compressing floating-point PyTorch tensors · ☆13 · Jul 22, 2024 · Updated last year
- 🤖 Implementation of Self Normalizing Networks (SNN) in PyTorch. · ☆12 · Jun 19, 2017 · Updated 8 years ago
- A Java-based framework for combinatorial test input generation, fault characterization and automated test execution. · ☆11 · Jan 22, 2024 · Updated 2 years ago
- ☆15 · Apr 26, 2025 · Updated 10 months ago
- Predicting the Stock Market - Can we do it? · ☆10 · Jul 24, 2021 · Updated 4 years ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta · ☆13 · Nov 11, 2024 · Updated last year
- ☆20 · Jul 23, 2025 · Updated 7 months ago
- Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks · ☆10 · Oct 21, 2022 · Updated 3 years ago