haizelabs / bijection-learning
☆22 Updated 6 months ago
Alternatives and similar repositories for bijection-learning
Users who are interested in bijection-learning are comparing it to the libraries listed below
- Sphynx Hallucination Induction ☆54 Updated 3 months ago
- Red-Teaming Language Models with DSPy ☆192 Updated 3 months ago
- ☆74 Updated 3 weeks ago
- Official repo for Learning to Reason for Long-Form Story Generation ☆51 Updated 3 weeks ago
- ☆46 Updated this week
- ☆54 Updated 7 months ago
- Verdict is a library for scaling judge-time compute. ☆211 Updated 2 weeks ago
- Open source interpretability artefacts for R1. ☆109 Updated 3 weeks ago
- ☆129 Updated last month
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆102 Updated last year
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆80 Updated 7 months ago
- ☆125 Updated last month
- ⚖️ Awesome LLM Judges ⚖️ ☆97 Updated 2 weeks ago
- ☆50 Updated 5 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆173 Updated 2 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆65 Updated last month
- ☆56 Updated last week
- Functional Benchmarks and the Reasoning Gap ☆86 Updated 7 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆32 Updated last month
- ☆48 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 9 months ago
- ☆22 Updated this week
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you… ☆70 Updated 5 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆92 Updated last week
- ☆92 Updated 2 months ago
- A subset of jailbreaks automatically discovered by the Haize Labs haizing suite. ☆91 Updated last month
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 4 months ago
- Train your own SOTA deductive reasoning model ☆92 Updated 2 months ago
- ☆81 Updated 4 months ago
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆35 Updated last week