Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
☆44Mar 20, 2024Updated 2 years ago
Alternatives and similar repositories for Prompt-OIRL
Users that are interested in Prompt-OIRL are comparing it to the libraries listed below.
- ☆14Oct 11, 2023Updated 2 years ago
- ☆28Oct 28, 2024Updated last year
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆19Apr 1, 2025Updated 11 months ago
- ☆12Aug 21, 2020Updated 5 years ago
- The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024.☆13Jun 17, 2024Updated last year
- ☆10Jan 28, 2024Updated 2 years ago
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆64Dec 8, 2024Updated last year
- This is AI implementation (not official) of the DreamGym framework from the paper "Scaling Agent Learning via Experience Synthesis" (arXi…☆39Nov 9, 2025Updated 4 months ago
- Repository of IPBench☆20Jan 4, 2026Updated 2 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models☆41Sep 30, 2024Updated last year
- ☆15Nov 19, 2021Updated 4 years ago
- ☆40Nov 13, 2025Updated 4 months ago
- Single-Life Reinforcement Learning☆14Dec 17, 2022Updated 3 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 8 months ago
- ☆33Jan 15, 2026Updated 2 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- ☆14Mar 5, 2024Updated 2 years ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Feb 23, 2024Updated 2 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.☆10May 16, 2024Updated last year
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers.☆11Apr 5, 2023Updated 2 years ago
- Format your bibtex (.bib) file to help standardize citations for conference and journal submissions☆14Nov 23, 2025Updated 4 months ago
- A Jekyll theme based on tufte-css, in the style of Edward Tufte☆20Dec 9, 2022Updated 3 years ago
- An awesome & curated list of advanced graph data-centric (i.e., graph sparsification, graph denoising, graph condensation) learning papers☆17Jun 9, 2025Updated 9 months ago
- PACIFIC: Towards Proactive Conversational Question Answering over Tabular and Textual Data in Finance☆14May 15, 2024Updated last year
- Implementation of AdaCQR(COLING 2025)☆13Dec 30, 2024Updated last year
- (ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆51Jul 1, 2025Updated 8 months ago
- SOTA work about out-of-distribution detection☆14Mar 5, 2021Updated 5 years ago
- Data and code for Emotion Prediction Errors☆10Feb 22, 2022Updated 4 years ago
- "A Discrete Variational Recurrent Topic Model without the Reparametrization Trick" (NeurIPS 2020)☆11Apr 26, 2021Updated 4 years ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and …☆72Apr 2, 2025Updated 11 months ago
- Code for the paper "Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference"☆43Jul 10, 2024Updated last year
- ☆14Jan 4, 2025Updated last year
- Source code for paper "Generative Flow Network for Listwise Recommendation"☆17Nov 8, 2024Updated last year
- Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning☆22Jul 8, 2024Updated last year
- A beautiful weather visualization Javascript library ☀🌤☁🌧🌨☆17Apr 26, 2021Updated 4 years ago
- ☆16Nov 1, 2023Updated 2 years ago
- ☆13Sep 26, 2024Updated last year
- ☆11Oct 22, 2024Updated last year
- source code for AAMAS 2023 Imperfect-information Card Game Competition☆13Mar 21, 2024Updated 2 years ago