hanningzhang / ER-PRM
☆20Dec 14, 2024Updated last year
Alternatives and similar repositories for ER-PRM
Users who are interested in ER-PRM are comparing it to the libraries listed below.
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆59Feb 6, 2026Updated last week
- Docs of NLP/deep Learning/machine learning, etc. https://siat-nlp.github.io/docs☆11Jul 17, 2019Updated 6 years ago
- [TMLR] Triple Preference Optimization☆30Feb 19, 2025Updated 11 months ago
- The attention map viewer for LLaMA models.☆37Dec 16, 2023Updated 2 years ago
- Open-source radar station vision code (2021) from the Evolution team at Guilin University of Electronic Technology☆12Sep 3, 2021Updated 4 years ago
- ☆13Oct 11, 2024Updated last year
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"☆41Sep 24, 2024Updated last year
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta☆13Nov 11, 2024Updated last year
- [ICML'25] MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents☆20Jul 31, 2025Updated 6 months ago
- Code for "ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer"☆15Jul 17, 2024Updated last year
- ☆13Jun 25, 2025Updated 7 months ago
- [NeurIPS 2024 poster] Cross-model Control: Improving Multiple Large Language Models in One-time Training☆14Oct 25, 2024Updated last year
- This is the source code for Efficient Sequential Recommendation for Long Term User Interest Via Personalization.☆19Nov 18, 2025Updated 2 months ago
- ☆12Jul 25, 2023Updated 2 years ago
- ☆11Oct 12, 2021Updated 4 years ago
- An approximate implementation of the OpenAI paper - An Empirical Model of Large-Batch Training for MNIST☆11Nov 19, 2022Updated 3 years ago
- INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness☆14Nov 10, 2025Updated 3 months ago
- LLMGeo: Benchmarking Large Language Models on Image Geolocation In-the-wild☆16Oct 31, 2024Updated last year
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…☆15Oct 16, 2023Updated 2 years ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards☆36Oct 3, 2025Updated 4 months ago
- This repository has been redirected into https://kuaisar.github.io/.☆11Oct 12, 2023Updated 2 years ago
- ☆10Apr 15, 2023Updated 2 years ago
- ☆10Jan 21, 2020Updated 6 years ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization☆12Jan 26, 2025Updated last year
- ☆14Feb 20, 2025Updated 11 months ago
- Birdiebot Target Perception And Decision Making Framework☆13Aug 29, 2022Updated 3 years ago
- Training diffusion model with CIFAR10 dataset(insight from 13 papers)☆15Aug 5, 2025Updated 6 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆57Jun 16, 2024Updated last year
- Lightblue LLM Eval Framework: tengu, elyza100, ja-mtbench, rakuda☆18Jan 6, 2026Updated last month
- ☆26Jan 5, 2026Updated last month
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes☆21Jun 15, 2025Updated 7 months ago
- ☆14Mar 1, 2023Updated 2 years ago
- ☆14Nov 14, 2023Updated 2 years ago
- ☆17Dec 11, 2024Updated last year
- [NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu,…☆18Dec 13, 2024Updated last year
- Tasks and tutorials using Graphcore's IPU with Hugging Face. Originally at https://github.com/gradient-ai/Graphcore-HuggingFace☆17Mar 12, 2024Updated last year
- This is the repo for our work “An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation” (ACL 2023).☆14Jul 23, 2023Updated 2 years ago
- Math-VR Benchmark & CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images☆52Nov 4, 2025Updated 3 months ago
- Dataset and Evaluation Code for the K-QA Benchmark.☆18May 26, 2024Updated last year