vicgalle / zero-shot-reward-models
ZYN: Zero-Shot Reward Models with Yes-No Questions
☆35Aug 15, 2023Updated 2 years ago
Alternatives and similar repositories for zero-shot-reward-models
Users that are interested in zero-shot-reward-models are comparing it to the libraries listed below
- Synthetic Data Generation with Execution-Based Verification and Grounding for LLM Training.☆19Feb 7, 2025Updated last year
- A repository for transformer critique learning and generation☆89Dec 7, 2023Updated 2 years ago
- Attentional Neural Network that translates text to phones.☆11Jan 25, 2018Updated 8 years ago
- ☆14Aug 15, 2024Updated last year
- ☆15Oct 26, 2021Updated 4 years ago
- K12 high school math exam question dataset☆15Aug 16, 2023Updated 2 years ago
- ☆39Aug 9, 2022Updated 3 years ago
- Code for Arxiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆207May 24, 2023Updated 2 years ago
- ☆26May 30, 2023Updated 2 years ago
- Official code release of AAAI 2024 paper SayCanPay.☆53Oct 22, 2025Updated 3 months ago
- ☆25Aug 23, 2024Updated last year
- This is the official code base for the paper "Think Before You Act: Decision Transformers with Internal Working Memory"☆23Jul 12, 2024Updated last year
- Code for ACL2024 paper - Adversarial Preference Optimization (APO).☆56Jun 3, 2024Updated last year
- MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation☆28Apr 18, 2024Updated last year
- ☆29May 8, 2024Updated last year
- ☆118May 26, 2025Updated 8 months ago
- ☆31Mar 23, 2024Updated last year
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆29Jun 1, 2024Updated last year
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech☆13Jan 3, 2023Updated 3 years ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism☆30Jul 17, 2024Updated last year
- A tool to paste Excel ranges to Reddit☆11Sep 20, 2025Updated 4 months ago
- [NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆33Jun 10, 2024Updated last year
- ☆42Nov 13, 2024Updated last year
- A tool that boosts chatgpt to its maximum potential☆40May 5, 2023Updated 2 years ago
- Continual Resilient (CoRe) Optimizer for PyTorch☆11Jun 10, 2024Updated last year
- Comparative Study and Implementation of Five Factor Model and Myers-Briggs Type Indicator Model☆11Sep 28, 2023Updated 2 years ago
- Multiprocessing in python☆10Aug 20, 2021Updated 4 years ago
- Public repository to host our Checker IP written in SVA that is ported to run on open-source Verilator.☆12Mar 31, 2023Updated 2 years ago
- Simple next-token-prediction for RLHF☆229Sep 30, 2023Updated 2 years ago
- ☆12Jun 26, 2020Updated 5 years ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues☆13Feb 7, 2024Updated 2 years ago
- ProxyExplainer for Graph Neural Networks☆15Oct 24, 2024Updated last year
- Dataset and codes for SEntFiN☆10May 31, 2023Updated 2 years ago
- ☆11Nov 8, 2023Updated 2 years ago
- ☆10Oct 11, 2022Updated 3 years ago
- Training and testing code from our CVPR 2023 paper "Are Deep Neural Networks SMARTer than Second Graders?"☆11Aug 10, 2023Updated 2 years ago
- Evaluation Pipeline for medical tasks.☆12Updated this week
- Code for the papers "Induction of Subgoal Automata for Reinforcement Learning" (AAAI-20) and "Induction and Exploitation of Subgoal Autom…☆13Aug 15, 2023Updated 2 years ago
- Project Gold ✨☆11Jan 29, 2026Updated 2 weeks ago