saffronh / ccai
Data processing for the Collective Constitutional AI project (a collaboration between The Collective Intelligence Project & Anthropic)
☆26Updated 2 years ago
Alternatives and similar repositories for ccai
Users that are interested in ccai are comparing it to the libraries listed below
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆66Updated last year
- Track the progress of LLM context utilisation☆55Updated 8 months ago
- Mixing Language Models with Self-Verification and Meta-Verification☆110Updated last year
- Gentopia Agent Zoo and Agent Benchmark☆31Updated 2 years ago
- Functional Benchmarks and the Reasoning Gap☆90Updated last year
- Official repo for Learning to Reason for Long-Form Story Generation☆72Updated 7 months ago
- Just a bunch of benchmark logs for different LLMs☆119Updated last year
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you…☆81Updated 11 months ago
- A set of utilities for running few-shot prompting experiments on large-language models☆126Updated 2 years ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆34Updated 7 months ago
- accompanying material for sleep-time compute paper☆118Updated 7 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆239Updated this week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆189Updated 9 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles☆61Updated 7 months ago
- A repository of projects and datasets under active development by Alignment Lab AI☆22Updated last year
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset☆23Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated 2 years ago
- Public Inflection Benchmarks☆68Updated last year
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…☆37Updated last year
- Multimodal computer agent data collection program☆152Updated last week
- ☆89Updated last month
- Open Source Replication of Anthropic's Alignment Faking Paper☆52Updated 8 months ago
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".☆70Updated last year
- Run SWE-bench evaluations remotely☆44Updated 4 months ago
- Official Repo for CRMArena and CRMArena-Pro☆126Updated last month
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks!☆54Updated 5 months ago
- Automatic Prompt Optimization☆48Updated last year
- Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs"☆85Updated 9 months ago
- Evaluating tool-augmented LLMs in conversation settings☆88Updated last year
- ☆86Updated 2 years ago