haizelabs / bijection-learning
☆19 Updated 3 months ago
Alternatives and similar repositories for bijection-learning:
Users who are interested in bijection-learning are comparing it to the repositories listed below.
- Sphynx Hallucination Induction ☆51 Updated 5 months ago
- Red-Teaming Language Models with DSPy ☆154 Updated 9 months ago
- ☆55 Updated this week
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆88 Updated 7 months ago
- ☆48 Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆120 Updated this week
- ☆80 Updated 3 weeks ago
- look how they massacred my boy ☆63 Updated 3 months ago
- ☆118 Updated last week
- Just a bunch of benchmark logs for different LLMs ☆117 Updated 6 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆69 Updated 3 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17 Updated 3 months ago
- Functional Benchmarks and the Reasoning Gap ☆82 Updated 3 months ago
- ☆48 Updated 2 months ago
- ☆37 Updated 6 months ago
- Verbosity control for AI agents ☆59 Updated 8 months ago
- ☆20 Updated 2 months ago
- Evaluating LLMs with CommonGen-Lite ☆88 Updated 10 months ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆64 Updated 7 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆156 Updated 3 months ago
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci… ☆24 Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆20 Updated last month
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆77 Updated this week
- ☆49 Updated 4 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 Updated last year
- Small, simple agent task environments for training and evaluation ☆18 Updated 2 months ago
- Track the progress of LLM context utilisation ☆53 Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 Updated 5 months ago
- Camel-Coder: Collaborative task completion with multiple agents. Role-based prompts, intervention mechanism, and thoughtful suggestions ☆33 Updated last year