saffronh / ccai
Data processing for the Collective Constitutional AI project (a collaboration between The Collective Intelligence Project & Anthropic)
☆24Updated 2 years ago
Alternatives and similar repositories for ccai
Users that are interested in ccai are comparing it to the libraries listed below
- Just a bunch of benchmark logs for different LLMs☆118Updated last year
- Functional Benchmarks and the Reasoning Gap☆89Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆63Updated 10 months ago
- ☆86Updated last year
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models"☆88Updated last year
- ☆135Updated 7 months ago
- A library for benchmarking the Long Term Memory and Continual learning capabilities of LLM based agents. With all the tests and code you…☆79Updated 10 months ago
- A repository of projects and datasets under active development by Alignment Lab AI☆22Updated last year
- Interaction-first method for generating demonstrations for web-agents on any website☆51Updated 6 months ago
- Mixing Language Models with Self-Verification and Meta-Verification☆109Updated 10 months ago
- ☆41Updated last year
- Gentopia Agent Zoo and Agent Benchmark☆31Updated 2 years ago
- Public Inflection Benchmarks☆68Updated last year
- LLM boxing matches☆58Updated last year
- A set of utilities for running few-shot prompting experiments on large-language models☆125Updated 2 years ago
- Multimodal computer agent data collection program☆151Updated last year
- ☆172Updated last week
- Official repo for Learning to Reason for Long-Form Story Generation☆72Updated 6 months ago
- A framework for evaluating function calls made by LLMs☆40Updated last year
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆188Updated 7 months ago
- ☆55Updated 4 months ago
- An automated tool for discovering insights from research paper corpora☆138Updated last year
- Evaluating LLMs with fewer examples☆164Updated last year
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆311Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy☆105Updated last month
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles☆59Updated 5 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆172Updated 9 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…☆181Updated 3 weeks ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)☆88Updated this week
- ☆14Updated 6 months ago