RuLES: a benchmark for evaluating rule-following in language models
☆249 Feb 24, 2025 Updated last year
Alternatives and similar repositories for llm_rules
Users that are interested in llm_rules are comparing it to the libraries listed below
- A benchmark to evaluate language models on questions I've previously asked them to solve. ☆1,042 Apr 27, 2025 Updated 10 months ago
- Fine-tune Mistral-7B on 3090s, A100s, H100s ☆724 Oct 11, 2023 Updated 2 years ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 Jan 17, 2024 Updated 2 years ago
- Self-Supervised Alignment with Mutual Information ☆20 May 24, 2024 Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 Nov 6, 2023 Updated 2 years ago
- Fast bare-bones BPE for modern tokenizer training ☆176 Jun 23, 2025 Updated 8 months ago
- Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action ☆37 Apr 3, 2023 Updated 2 years ago
- PAL: Proxy-Guided Black-Box Attack on Large Language Models ☆57 Aug 17, 2024 Updated last year
- ☆76 Feb 16, 2024 Updated 2 years ago
- JAX implementation of ViT-VQGAN ☆63 Jul 23, 2022 Updated 3 years ago
- Latest Evaluation Toolkit (LatestEval). Assessing the language models with latest, uncontaminated materials. ☆28 Feb 17, 2025 Updated last year
- Dataset for the Tensor Trust project ☆48 Mar 17, 2024 Updated last year
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆222 Aug 10, 2023 Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆316 Dec 20, 2023 Updated 2 years ago
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆116 Jun 13, 2024 Updated last year
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con… ☆50 Dec 20, 2023 Updated 2 years ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆129 Feb 24, 2025 Updated last year
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆124 Sep 9, 2024 Updated last year
- ☆51 Oct 28, 2024 Updated last year
- Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python. ☆6,185 Aug 22, 2025 Updated 6 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆127 Mar 22, 2024 Updated last year
- Martingale posterior neural networks for fast sequential decision making @ NeurIPS 2025 ☆23 Nov 13, 2025 Updated 3 months ago
- ☆20 Nov 4, 2025 Updated 4 months ago
- Small, simple agent task environments for training and evaluation ☆19 Nov 1, 2024 Updated last year
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆63 Dec 25, 2023 Updated 2 years ago
- ☆22 Feb 26, 2024 Updated 2 years ago
- ☆23 Dec 15, 2022 Updated 3 years ago
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" ☆474 Mar 19, 2024 Updated last year
- This repository provides an original implementation of Detecting Pretraining Data from Large Language Models by *Weijia Shi, *Anirudh Aji… ☆242 Nov 3, 2023 Updated 2 years ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 Feb 7, 2024 Updated 2 years ago
- ☆251 Dec 21, 2022 Updated 3 years ago
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,824 Jun 17, 2025 Updated 8 months ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) ☆3,211 Feb 8, 2026 Updated last month
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ☆37 Jun 10, 2024 Updated last year
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆952 Nov 16, 2025 Updated 3 months ago
- Tile primitives for speedy kernels ☆3,202 Feb 24, 2026 Updated last week
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆595 Aug 12, 2025 Updated 6 months ago
- UNet diffusion model in pure CUDA ☆657 Jun 28, 2024 Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025) ☆73 Jun 25, 2024 Updated last year