Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy"
☆17 · Feb 25, 2025 · Updated last year
Alternatives and similar repositories for IHEval
Users that are interested in IHEval are comparing it to the libraries listed below.
- Inferring Strange Behavior from Connectivity Pattern (PAKDD 2014, KAIS 2015) ☆11 · Mar 27, 2015 · Updated 11 years ago
- Python package for InfoAlign ☆13 · Oct 14, 2024 · Updated last year
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆54 · Oct 1, 2024 · Updated last year
- This repository is unmaintained, please see lumo for details. ☆10 · Mar 19, 2023 · Updated 3 years ago
- Code of paper: "xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking" ☆18 · Apr 3, 2026 · Updated 2 weeks ago
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022. ☆24 · Nov 23, 2022 · Updated 3 years ago
- ☆20 · Jun 16, 2025 · Updated 10 months ago
- Catching Synchronized Behavior in Large Directed Graphs (KDD 2014) ☆22 · Mar 27, 2015 · Updated 11 years ago
- Modified CartPole-v0 OpenAI Gym environment with various noisy cases and Reinforcement Learning based controller ☆10 · Dec 5, 2017 · Updated 8 years ago
- Multi-Critic Policy Gradient Optimization for Quadcopter Coordination ☆14 · Aug 10, 2021 · Updated 4 years ago
- ☆14 · Aug 15, 2024 · Updated last year
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆29 · Feb 17, 2025 · Updated last year
- Author: Wenhao Yu (wyu1@nd.edu). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation ☆55 · Jan 30, 2023 · Updated 3 years ago
- Multi-Agent Reinforcement Learning ☆11 · Jun 16, 2020 · Updated 5 years ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 · Aug 28, 2025 · Updated 7 months ago
- [NeurIPS'23] Source code of "Data-Centric Learning from Unlabeled Graphs with Diffusion Model": A data-centric transfer learning framewor… ☆22 · Jun 4, 2025 · Updated 10 months ago
- ☆13 · Jan 14, 2026 · Updated 3 months ago
- ☆27 · Jun 5, 2024 · Updated last year
- ☆36 · Jun 10, 2024 · Updated last year
- Mixture of Expert (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations. ☆12 · Feb 11, 2024 · Updated 2 years ago
- Beyond Myopia: Learning from Positive and Unlabeled Data through Holistic Predictive Trends [NeurIPS 2023] ☆10 · Jan 28, 2024 · Updated 2 years ago
- Trial Reasoner for AI that Learns ☆18 · Sep 17, 2025 · Updated 7 months ago
- Compare how fine-tuned AI video models interpret the same prompts ☆14 · Jan 29, 2025 · Updated last year
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Updated this week
- source code for ICLR'24 paper "How does unlabeled data provably help OOD detection?" ☆13 · Feb 1, 2024 · Updated 2 years ago
- ☆31 · Oct 23, 2024 · Updated last year
- ☆27 · Sep 15, 2025 · Updated 7 months ago
- Official implementation for “HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Opt… ☆25 · Jan 10, 2026 · Updated 3 months ago
- A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform. ☆13 · Jul 27, 2025 · Updated 8 months ago
- A list of large language models for user modeling (LLM-UM) papers, based on "User Modeling in the Era of Large Language Models: Current R… ☆151 · Apr 8, 2024 · Updated 2 years ago
- FaithScore: Fine-grained Evaluations of Hallucinations in Large Vision-Language Models ☆33 · Nov 27, 2025 · Updated 4 months ago
- [CVPR 2024] Targeted Representation Alignment for Open-World Semi-Supervised Learning ☆14 · Sep 23, 2024 · Updated last year
- PyTorch Implementation of LoG 22 [Oral] -- Transductive Linear Probing: A Novel Framework for Few-Shot Node Classification ☆17 · May 31, 2023 · Updated 2 years ago
- A program repair tool which modifies any bugged Python script based on cues from the rest of the program. ☆20 · Jun 14, 2021 · Updated 4 years ago
- [NAACL 2025 Main] Official Implementation of MLLMU-Bench ☆51 · Mar 13, 2025 · Updated last year
- Foundational data structures and algorithms for time-oriented data in Visual Analytics ☆26 · May 13, 2019 · Updated 6 years ago
- [ICLR 2025 SSI-FM] Self-Taught Self-Correction for Small Language Models ☆11 · Sep 19, 2025 · Updated 7 months ago
- This is the PyTorch implementation of the long paper on ACL 2020: A Self-Training Method for Machine Reading Comprehension with Soft Evid… ☆34 · Aug 14, 2020 · Updated 5 years ago
- A better Alpaca Model Trained with Less Data (only 9k instructions of the original set) ☆24 · Jul 26, 2024 · Updated last year