haizelabs / thorn-in-haizestack
Thorn in a HaizeStack test for evaluating long-context adversarial robustness.
☆26 Updated 3 months ago
Related projects
Alternatives and complementary repositories for thorn-in-haizestack
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆44 Updated 5 months ago
- ☆101 Updated 3 months ago
- Sphynx Hallucination Induction ☆48 Updated 3 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆41 Updated last month
- Small, simple agent task environments for training and evaluation ☆16 Updated 2 weeks ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆93 Updated 3 months ago
- Utilities for loading and running text embeddings with onnx ☆39 Updated 3 months ago
- Red-Teaming Language Models with DSPy ☆142 Updated 7 months ago
- Turing machines, Rule 110, and A::B reversal using Claude 3 Opus. ☆60 Updated 6 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆86 Updated 5 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆100 Updated 2 weeks ago
- LLMs as Collaboratively Edited Knowledge Bases ☆43 Updated 9 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆128 Updated 3 weeks ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆22 Updated last month
- ☆41 Updated 3 weeks ago
- ☆20 Updated 2 weeks ago
- Functional Benchmarks and the Reasoning Gap ☆78 Updated last month
- The repository contains code for Adaptive Data Optimization ☆18 Updated last month
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆62 Updated 5 months ago
- gzip Predicts Data-dependent Scaling Laws ☆32 Updated 5 months ago
- Alice in Wonderland code base for experiments and raw experiments data ☆109 Updated last month
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 ☆30 Updated last month
- A trace analysis tool for AI agents. ☆124 Updated last month
- Evaluating LLMs with CommonGen-Lite ☆85 Updated 8 months ago
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud… ☆14 Updated 3 weeks ago
- Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement ☆46 Updated 3 weeks ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆77 Updated 8 months ago
- A toolkit for describing model features and intervening on those features to steer behavior. ☆106 Updated last week
- Replicating O1 inference-time scaling laws ☆49 Updated last month
- Experiments for efforts to train a new and improved t5 ☆76 Updated 7 months ago