yikee / ScienceMeter
ScienceMeter: Tracking Scientific Knowledge Updates in Language Models
☆16Updated 3 months ago
Alternatives and similar repositories for ScienceMeter
Users who are interested in ScienceMeter are comparing it to the repositories listed below.
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024☆18Updated last week
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation☆15Updated 2 months ago
- ☆28Updated 7 months ago
- ☆15Updated last year
- AbstainQA, ACL 2024☆28Updated last year
- ☆52Updated 6 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering☆66Updated 10 months ago
- ☆22Updated 10 months ago
- This repository contains data, code and models for contextual noncompliance.☆24Updated last year
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods☆136Updated 3 months ago
- ☆49Updated 10 months ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆71Updated last year
- Optimize Any User-defined Compound AI Systems☆53Updated 2 months ago
- Models, data, and code for the paper: MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models☆24Updated last year
- LoFiT: Localized Fine-tuning on LLM Representations☆41Updated 9 months ago
- ☆29Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024)☆40Updated 11 months ago
- The Prism Alignment Project☆82Updated last year
- [EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners☆25Updated 10 months ago
- ☆92Updated last year
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data☆43Updated 8 months ago
- ☆56Updated 2 years ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)☆49Updated 9 months ago
- A holistic benchmark for LLM abstention☆53Updated last month
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award …☆42Updated 11 months ago
- Few-shot Learning with Auxiliary Data☆31Updated last year
- The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph☆17Updated last year
- personalized-llms with allen institute☆14Updated 2 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆31Updated 8 months ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/☆25Updated 7 months ago