jennhu / metalinguistic-prompting
Materials for "Prompting is not a substitute for probability measurements in large language models" (EMNLP 2023)
☆21Updated last year
Alternatives and similar repositories for metalinguistic-prompting:
Users who are interested in metalinguistic-prompting compare it to the repositories listed below.
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.☆41Updated last year
- ☆34Updated 3 years ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/☆21Updated this week
- Apps built using Inspired Cognition's Critique.☆58Updated 2 years ago
- ☆31Updated 8 months ago
- Code repository for the paper "Mission: Impossible Language Models."☆48Updated this week
- ☆16Updated 3 years ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers☆56Updated 9 months ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks☆41Updated 3 months ago
- The evaluation pipeline for the 2024 BabyLM Challenge.☆29Updated 4 months ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language☆42Updated 2 years ago
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"☆75Updated last year
- Evaluation pipeline for the BabyLM Challenge 2023.☆75Updated last year
- ☆24Updated last year
- A curated list of research papers and resources on Cultural LLM.☆39Updated 5 months ago
- ☆127Updated last month
- ☆24Updated 3 months ago
- ☆45Updated last year
- 🌾 Universal, customizable and deployable fine-grained evaluation for text generation.☆22Updated last year
- Teaching Models to Express Their Uncertainty in Words☆37Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆77Updated 11 months ago
- ☆44Updated last year
- The geometry of multilingual language model representations (EMNLP 2022).☆19Updated 2 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆43Updated 7 months ago
- ☆38Updated last year
- Data for evaluating gender bias in coreference resolution systems.☆74Updated 5 years ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"☆59Updated last month
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration"☆31Updated 6 months ago
- Detect hallucinated tokens for conditional sequence generation.☆64Updated 2 years ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation".☆75Updated this week