Calculates the perplexity of a text with pre-trained language models. Supports MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder LM (e.g. Flan-T5).
☆167 · Jun 20, 2025 · Updated 10 months ago
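For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is commonly defined: the exponential of the negative mean log-probability a model assigns to each token. It does not use lmppl or reproduce its API. The function name `perplexity` and the input values are hypothetical, chosen only for illustration.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical per-token log-probabilities, not the output of a real model.
# A model that assigns probability 0.25 to every token is as uncertain as a
# uniform choice among four options, so its perplexity is close to 4.
uniform_guess = [math.log(0.25)] * 4
print(perplexity(uniform_guess))
```

Lower values mean the model found the text more predictable. A perplexity of 1 would mean every token was predicted with certainty.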
Alternatives and similar repositories for lmppl
Users who are interested in lmppl are comparing it to the libraries listed below.
- ☆29 · Mar 20, 2024 · Updated 2 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 · Nov 21, 2022 · Updated 3 years ago
- ☆15 · Nov 20, 2025 · Updated 5 months ago
- Using BERT to calculate perplexity ☆20 · Dec 20, 2019 · Updated 6 years ago
- Lite Self-Training ☆30 · Jul 25, 2023 · Updated 2 years ago
- Official code repository for Correct-N-Contrast ☆23 · Jul 18, 2022 · Updated 3 years ago
- Word acquisition in neural language models (TACL 2022). ☆20 · Jan 30, 2025 · Updated last year
- Japanese LLaMa experiment ☆54 · Dec 27, 2025 · Updated 4 months ago
- Difference-based Contrastive Learning for Korean Sentence Embeddings ☆23 · Mar 11, 2026 · Updated last month
- ☆13 · Dec 1, 2021 · Updated 4 years ago
- CSS-LM: Contrastive Semi-supervised Fine-tuning of Pre-trained Language Models ☆12 · Jul 1, 2023 · Updated 2 years ago
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder ☆10 · Mar 16, 2023 · Updated 3 years ago
- [NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings ☆22 · Jan 30, 2023 · Updated 3 years ago
- ☆24 · Apr 8, 2019 · Updated 7 years ago
- ☆14 · Feb 9, 2022 · Updated 4 years ago
- Robust and Memory Efficient Event Detection and Tracking in Large News Feeds ☆13 · Oct 15, 2021 · Updated 4 years ago
- [NAACL'22] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning ☆94 · Jun 8, 2022 · Updated 3 years ago
- A repository for experiments in quality-aware decoding ☆18 · Jun 7, 2022 · Updated 3 years ago
- ☆24 · Nov 22, 2022 · Updated 3 years ago
- This repository provides the code and dataset for the work published in the paper - Modeling Label Semantics for Predicting Emotional Rea… ☆26 · Nov 8, 2020 · Updated 5 years ago
- ☆13 · Apr 5, 2026 · Updated last month
- [AAAI 2024] History Matters: Temporal Knowledge Editing in Large Language Model ☆14 · Dec 17, 2023 · Updated 2 years ago
- albumentations test ☆11 · Jun 23, 2020 · Updated 5 years ago
- Forked repo from https://github.com/EleutherAI/lm-evaluation-harness/commit/1f66adc ☆81 · Feb 28, 2024 · Updated 2 years ago
- ☆21 · Mar 28, 2022 · Updated 4 years ago
- A simple semi-supervised approach for creating huggingface data script loaders and uploading them to the hub. ☆11 · Jun 23, 2024 · Updated last year
- ☆42 · Oct 29, 2024 · Updated last year
- Pytorch Tutorial for M1 students. This repository includes code for building an Encoder-Decoder model and a Classification model. ☆12 · Jun 1, 2022 · Updated 3 years ago
- Code for EMNLP 2021 paper: Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting ☆17 · Nov 30, 2021 · Updated 4 years ago
- Arabic Word-Embedding (Word2vec) model training from Wikipedia articles ☆11 · Dec 13, 2018 · Updated 7 years ago
- A library for evaluation of Grammatical Error Correction (GEC). Accepted to ACL'25 Demo: "gec-metrics: A Unified Library for Grammatical … ☆14 · Jan 25, 2026 · Updated 3 months ago
- Official Implementation of "Simulating Environments with Reasoning Models for Agent Training" ☆65 · Feb 18, 2026 · Updated 2 months ago
- End-to-end codebase for finetuning LLMs (LLaMA 2, 3, etc.) with or without DP ☆17 · Sep 23, 2024 · Updated last year
- An accurate multilingual word aligner based on LaBSE ☆24 · Oct 25, 2023 · Updated 2 years ago
- coFR: COreference resolution tool for FRench (and singletons). ☆27 · Jun 7, 2020 · Updated 5 years ago
- Optimization methods ☆30 · Jan 5, 2015 · Updated 11 years ago
- A powerful text cleaner for Japanese web texts ☆12 · Jan 20, 2024 · Updated 2 years ago
- Entitypedia is an Extended Named Entity Dictionary from Wikipedia. ☆13 · Dec 7, 2022 · Updated 3 years ago
- ☆19 · Apr 26, 2026 · Updated last week