Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
☆22Apr 24, 2025Updated last year
Alternatives and similar repositories for embedding-based-llm-alignment
Users that are interested in embedding-based-llm-alignment are comparing it to the libraries listed below.
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆72Apr 2, 2025Updated last year
- ☆26Oct 26, 2020Updated 5 years ago
- Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL☆29Oct 29, 2023Updated 2 years ago
- Simulation and power analysis of panel/hierarchical data that allows for independently generating effects by cross-section (between-subje…☆18May 14, 2025Updated last year
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.☆18Apr 22, 2025Updated last year
- Code for EMNLP'24 paper - On Diversified Preferences of Large Language Model Alignment☆16Aug 6, 2024Updated last year
- [AAAI'25] CharacterBench: Benchmarking Character Customization of Large Language Models☆22Aug 1, 2025Updated 9 months ago
- Source code for Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach (NeurIPS 2023)☆10Dec 12, 2023Updated 2 years ago
- Github Repo for ICML 2022 paper: Communication-Efficient Adaptive Federated Learning☆10Nov 18, 2022Updated 3 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- Face hashing using neural networks, mapping images to Hamming codes.☆10Dec 21, 2018Updated 7 years ago
- The implementation for the work "Unconstrained Monotonic Calibration of Predictions in Deep Ranking Systems".☆23Jun 11, 2025Updated 11 months ago
- These are experiments for examining reproducibility in Policy Gradient RL algorithms in Continuous domains. Mainly using the Rllab implem…☆17Sep 20, 2017Updated 8 years ago
- ☆42Nov 8, 2025Updated 6 months ago
- Distributed Feedback-Looped Networks☆10Jan 15, 2020Updated 6 years ago
- The first real-world FL benchmark for legal NLP☆13Nov 29, 2023Updated 2 years ago
- PyTorch implementation of Swap-VAE: A self-supervised approach for generating neural activity☆13Nov 17, 2021Updated 4 years ago
- ☆15Feb 18, 2021Updated 5 years ago
- Code for [NeurIPS'2019 Spotlight] Policy Continuation with Hindsight Inverse Dynamics☆15Jan 7, 2020Updated 6 years ago
- Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt…☆38Aug 11, 2024Updated last year
- Code for SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes (NeurIPS 2021)☆12Nov 30, 2021Updated 4 years ago
- A public repo for ICML 2021 "Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks"☆13Jul 19, 2021Updated 4 years ago
- ☆13Feb 24, 2026Updated 3 months ago
- Official implementation of Neural Episodic Control with State Abstraction☆13Aug 3, 2023Updated 2 years ago
- uncertainty-guided matting on ICML2023☆12Aug 3, 2023Updated 2 years ago
- [EMNLP 2025] Dataset and Code of "PersonaGym: Evaluating Persona Agents and LLMs"☆42Aug 21, 2025Updated 9 months ago
- hdnet - Hopfield denoising network☆14Oct 6, 2022Updated 3 years ago
- LSST data management: instrument signature removal (detrending) for astronomical images☆10Updated this week
- L4: Practical loss-based stepsize adaptation for PyTorch