dvruette / concept-guidance
Code accompanying the paper "A Language Model's Guide Through Latent Space". It contains functionality for training and using concept vectors that control the behavior of LLMs at inference time.
☆21 Updated last year
Alternatives and similar repositories for concept-guidance
Users who are interested in concept-guidance are comparing it to the libraries listed below.
- Multi-Domain Expert Learning ☆67 Updated last year
- Simple GRPO scripts and configurations. ☆59 Updated 6 months ago
- Official repo for Learning to Reason for Long-Form Story Generation ☆68 Updated 3 months ago
- ☆37 Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 Updated last year
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆32 Updated 3 months ago
- ☆119 Updated 5 months ago
- ☆38 Updated last year
- A repository for transformer critique learning and generation ☆90 Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆75 Updated 11 months ago
- ☆88 Updated 7 months ago
- Measuring the situational awareness of language models ☆37 Updated last year
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models" ☆44 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆88 Updated 10 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated 2 years ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆50 Updated 3 months ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 Updated 2 years ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆44 Updated last year
- Code repository for the c-BTM paper ☆107 Updated last year
- ☆46 Updated last year
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 Updated last year
- Memoria is a human-inspired memory architecture for neural networks. ☆75 Updated 9 months ago
- ☆49 Updated last year
- ☆28 Updated last year
- ☆63 Updated 10 months ago
- ☆44 Updated 8 months ago
- ☆54 Updated last year
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22 ☆66 Updated 2 years ago
- Latent Diffusion Language Models ☆69 Updated last year
- An experiment to see if chatgpt can improve the output of the stanford alpaca dataset ☆12 Updated 2 years ago