centerforaisafety / emergent-values
Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs"
☆54Updated 5 months ago
Alternatives and similar repositories for emergent-values
Users that are interested in emergent-values are comparing it to the libraries listed below
- ☆76Updated this week
- Analysis code for paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"☆45Updated this week
- Functional Benchmarks and the Reasoning Gap☆88Updated 10 months ago
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience"☆61Updated 4 months ago
- Code for ExploreTom☆84Updated last month
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆72Updated 4 months ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.☆166Updated this week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆82Updated this week
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆91Updated 6 months ago
- ☆43Updated 9 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you…☆76Updated 7 months ago
- Persona Vectors: Monitoring and Controlling Character Traits in Language Models☆135Updated last week
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆59Updated 8 months ago
- accompanying material for sleep-time compute paper☆102Updated 3 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments☆87Updated 10 months ago
- Train your own SOTA deductive reasoning model☆104Updated 5 months ago
- Source code for the collaborative reasoner research project at Meta FAIR.☆100Updated 3 months ago
- ☆71Updated last week
- Evaluating LLMs with fewer examples☆160Updated last year
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.☆45Updated 3 months ago
- ☆53Updated 9 months ago
- Open source interpretability artefacts for R1.☆157Updated 3 months ago
- An automated tool for discovering insights from research paper corpora☆138Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆60Updated 11 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆173Updated 6 months ago
- ☆95Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆176Updated 5 months ago
- Simple examples using Argilla tools to build AI☆53Updated 8 months ago
- ☆54Updated last month
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…☆50Updated 3 months ago