openai / automated-interpretability
☆998 Updated last year
Alternatives and similar repositories for automated-interpretability:
Users who are interested in automated-interpretability are comparing it to the repositories listed below:
- Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" ☆1,707 Updated last year
- 800,000 step-level correctness labels on LLM solutions to MATH problems ☆1,952 Updated last year
- ☆1,028 Updated last year
- PaL: Program-Aided Language Models (ICML 2023) ☆484 Updated last year
- Dromedary: towards helpful, ethical and reliable LLMs. ☆1,142 Updated last year
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆922 Updated 5 months ago
- A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data. ☆801 Updated 8 months ago
- Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI ☆2,013 Updated 8 months ago
- ☆1,507 Updated last week
- [NeurIPS 22] [AAAI 24] Recurrent Transformer-based long-context architecture. ☆760 Updated 4 months ago
- Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models". ☆1,122 Updated last year
- Code for fine-tuning Platypus fam LLMs using LoRA ☆628 Updated last year
- Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering". ☆644 Updated 6 months ago
- Representation Engineering: A Top-Down Approach to AI Transparency ☆807 Updated 7 months ago
- Alpaca dataset from Stanford, cleaned and curated ☆1,544 Updated last year
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions ☆820 Updated last year
- OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning. ☆552 Updated last year
- The hub for EleutherAI's work on interpretability and learning dynamics ☆2,423 Updated last week
- Mass-editing thousands of facts into a transformer memory (ICLR 2023) ☆470 Updated last year
- TruthfulQA: Measuring How Models Imitate Human Falsehoods ☆696 Updated 2 months ago
- [NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333 ☆1,095 Updated last year
- ChatArena (or Chat Arena) is a Multi-Agent Language Game Environments for LLMs. The goal is to develop communication and collaboration ca… ☆1,437 Updated 9 months ago
- Salesforce open-source LLMs with 8k sequence length. ☆717 Updated last month
- ☆444 Updated last year
- ☆1,226 Updated last year
- ☆740 Updated 9 months ago
- Locating and editing factual associations in GPT (NeurIPS 2022) ☆609 Updated 11 months ago
- A collection of open-source dataset to train instruction-following LLMs (ChatGPT,LLaMA,Alpaca) ☆1,112 Updated last year
- Benchmarking large language models' complex reasoning ability with chain-of-thought prompting ☆2,694 Updated 7 months ago
- ☆525 Updated last year