babylm / evaluation-pipeline-2024

The evaluation pipeline for the 2024 BabyLM Challenge.

☆33

Alternatives and similar repositories for evaluation-pipeline-2024

Users who are interested in evaluation-pipeline-2024 are comparing it to the libraries listed below.

babylm / evaluation-pipeline-2023
Evaluation pipeline for the BabyLM Challenge 2023.
☆77 Updated 2 years ago
tau-nlp / scrolls
The official code for the EMNLP 2022 paper "SCROLLS: Standardized CompaRison Over Long Language Sequences".
☆69 Updated last year
cpllab / syntactic-generalization
Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models"
☆29 Updated 4 years ago
GEM-benchmark / GEM-metrics
Automatic metrics for GEM tasks
☆67 Updated 3 years ago
lukemelas / mtob
☆41 Updated last year
mega002 / ff-layers
The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le…
☆99 Updated 4 years ago
machelreid / m2d2
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
☆54 Updated 2 years ago
jkallini / mission-impossible-language-models
Code repository for the paper "Mission: Impossible Language Models."
☆54 Updated last month
kernelmachine / demix-data
Benchmark API for Multidomain Language Modeling
☆25 Updated 3 years ago
martiansideofthemoon / rankgen
Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…
☆138 Updated 2 years ago
facebookresearch / NPM
The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper: https://arxiv.org/abs/2212.01349)
☆158 Updated 2 years ago
google-research / t5x_retrieval
☆101 Updated 2 years ago
nouhadziri / faith-and-fate
☆37 Updated last year
huggingface / that_is_good_data
☆65 Updated 2 years ago
nyu-mll / SQuALITY
Query-focused summarization data
☆42 Updated 2 years ago
mega002 / lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
☆180 Updated 3 years ago
allenai / Lila
A unified benchmark for math reasoning
☆89 Updated 2 years ago
kernelmachine / demix
DEMix Layers for Modular Language Modeling
☆54 Updated 4 years ago
ZurichNLP / mbr
Minimum Bayes Risk Decoding for Hugging Face Transformers
☆60 Updated last year
nyu-mll / quality
☆141 Updated 10 months ago
bigscience-workshop / multilingual-modeling
BLOOM+1: Adapting BLOOM model to support a new unseen language
☆74 Updated last year
SimengSun / ChapterBreak
☆11 Updated last year
sustcsonglin / TN-PCFG
Source code of NAACL 2021 "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and ACL 2021 main conferenc…
☆51 Updated 7 months ago
ryokamoi / wice
This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.
☆41 Updated last year
alisawuffles / DExperts
Code associated with the ACL 2021 DExperts paper
☆118 Updated 2 years ago
awebson / prompt_semantics
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”
☆85 Updated 3 years ago
suzgunmirac / crowd-sampling
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding
☆18 Updated 3 years ago
INK-USC / CrossFit
Code for the paper "CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP" (https://arxiv.org/abs/2104.08835)
☆113 Updated 3 years ago
naver / gdc
Code accompanying our papers on the "Generative Distributional Control" framework
☆118 Updated 2 years ago
neulab / knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an…
☆282 Updated 3 years ago