apple / ml-entity-deduction-arena
☆34 Updated last year
Alternatives and similar repositories for ml-entity-deduction-arena
Users who are interested in ml-entity-deduction-arena are comparing it to the libraries listed below.
- ☆56 Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind ☆68 Updated last year
- Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models, ICML 2024 ☆22 Updated last year
- Official implementation of FIND (NeurIPS '23) Function Interpretation Benchmark and Automated Interpretability Agents ☆51 Updated last year
- PyTorch library for Active Fine-Tuning ☆93 Updated last month
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆51 Updated last year
- ☆100 Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated last year
- ☆41 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆100 Updated last year
- ☆29 Updated 2 weeks ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆56 Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆52 Updated 2 years ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆118 Updated last year
- Official implementation of "BERTs are Generative In-Context Learners" ☆32 Updated 7 months ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆47 Updated 2 years ago
- ☆43 Updated 3 years ago
- PASTA: Post-hoc Attention Steering for LLMs ☆126 Updated 11 months ago
- A mechanistic approach for understanding and detecting factual errors of large language models. ☆46 Updated last year
- Online Adaptation of Language Models with a Memory of Amortized Contexts (NeurIPS 2024) ☆69 Updated last year
- Code accompanying the paper Pretraining Language Models with Human Preferences ☆180 Updated last year
- ☆103 Updated last year
- RL algorithm: Advantage induced policy alignment ☆65 Updated 2 years ago
- ☆27 Updated last year
- ☆128 Updated last year
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆65 Updated 8 months ago
- SILO Language Models code repository ☆83 Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆89 Updated last year
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆61 Updated 2 years ago