byungdoh / llm_surprisal
Surprisal calculation using HuggingFace LMs ("Frequency Explains the Inverse Correlation of Large Language Models’ Size, Training Data Amount, and Surprisal’s Fit to Reading Times," EACL 2024)
☆10 Updated 8 months ago
Related projects
Alternatives and complementary repositories for llm_surprisal
- SAP Benchmark ☆13 Updated 2 months ago
- ☆11 Updated 2 years ago
- A psycholinguistic modeling toolkit ☆24 Updated last week
- A neural language model that estimates incremental processing complexity ☆39 Updated 3 years ago
- ☆21 Updated 3 years ago
- ☆19 Updated 3 years ago
- Unsupervised Grammar Induction with Combinatory Categorial Grammars ☆10 Updated 3 years ago
- ☆24 Updated 6 months ago
- Code and Results for "Universals of word order reflect optimization of grammars for efficient communication" ☆12 Updated 2 years ago
- Scripts to evaluate scoped meaning representations ☆19 Updated 2 years ago
- Analysis pipeline for Revisiting UID (EMNLP 2021) ☆10 Updated 2 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆56 Updated last year
- Diagnostic tests for linguistic capacities in language models ☆66 Updated 2 years ago
- Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference ☆15 Updated 4 years ago
- A repository for high-quality QASRL data collected from crowd-workers. ☆11 Updated last year
- A repository for the EMNLP 2021 paper "Is Information Density Uniform in Task-Oriented Dialogues?" and for the CoNLL 2021 paper "Analysin… ☆10 Updated 5 months ago
- Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models" ☆24 Updated 3 years ago
- Corpus of naturalistic stories with annotation and psycholinguistic measures ☆50 Updated 3 years ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) ☆63 Updated last year
- Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages ☆13 Updated 4 years ago
- A simple, Python-based, command-line runner for MGIZA++. ☆11 Updated 2 years ago
- Several algorithms for converting dependency structures into constituency structures. ☆9 Updated 2 years ago
- ☆16 Updated 2 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆37 Updated 2 years ago
- A framework for nonlinear continuous-time regression ☆30 Updated 6 months ago
- Organized inventory of research using the Abstract Meaning Representation ☆36 Updated this week
- ☆36 Updated 5 years ago
- ☆30 Updated last month
- Datasets for the Monolingual Word Sense Alignment (MWSA) task ☆12 Updated 4 years ago
- Evaluating Text Representations on Lexical Composition ☆24 Updated 5 years ago