meaningalignment / dft
Democratic Fine-tuning with a Moral Graph
☆22 Updated 9 months ago
Alternatives and similar repositories for dft
Users that are interested in dft are comparing it to the libraries listed below
- A Loom implementation in Obsidian ☆302 Updated 4 months ago
- Add some intelligence to your notes with Silicon AI for Obsidian ☆129 Updated last year
- Command-line recursive question-answering with immutable contexts and explicit data store ☆26 Updated 6 years ago
- Factored Cognition Primer: How to write compositional language model programs ☆49 Updated 2 years ago
- Public repo for my book Symphony of Thought: Orchestrating Artificial Cognition ☆113 Updated 2 years ago
- Analyzing and scoring reasoning traces of LLMs ☆46 Updated 11 months ago
- TextReducer - A Tool for Summarization and Information Extraction ☆88 Updated last year
- ☆53 Updated last year
- AI Safety Q&A web frontend ☆40 Updated last week
- Code for the paper: "Large Language Models as Corporate Lobbyists" (2023). ☆171 Updated 2 years ago
- Enable decision-making based on simulations ☆227 Updated last year
- The AI assistant for Obsidian that helps you write better and think more clearly ☆67 Updated 2 years ago
- A dataset of alignment research and code to reproduce it ☆77 Updated 2 years ago
- GPT-based Conversation Summarizer ☆148 Updated 2 years ago
- Interactive Composition Explorer: a debugger for compositional language model programs ☆554 Updated last month
- Multiversal tree writing interface for human-AI collaboration ☆1,280 Updated last year
- Automatically create Anki cards from text using language models ☆15 Updated 2 years ago
- Hosted embedding platform to discover, evaluate, and retrieve embeddings ☆73 Updated last year
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆209 Updated this week
- This repository explains and provides examples for "concept anchoring" in GPT4. ☆71 Updated last year
- Problem solving by engaging multiple AI agents in conversation with each other and the user. ☆223 Updated last year
- The fastest way to make and track predictions ☆50 Updated 2 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆98 Updated 3 months ago
- [Added T5 support to TRLX] A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆47 Updated 2 years ago
- Work in progress! I don't recommend looking at the code right now. ☆23 Updated 2 months ago
- An awesome list of AnthropicAI's Claude model ☆52 Updated 2 years ago
- Global Alignment Taxonomy Omnibus ☆28 Updated 2 years ago
- An alternative frontend for LessWrong 2.0 ☆75 Updated 2 weeks ago
- Test suite for LLM prompts ☆51 Updated last year
- ☆40 Updated 2 years ago