awslabs/mlm-scoring

Python library & examples for Masked Language Model Scoring (ACL 2020)

☆350

Alternatives and similar repositories for mlm-scoring

Users who are interested in mlm-scoring are comparing it to the libraries listed below.

ICASSP2021-tutorial9 / Distant_conversational_ASR_and_analysis
☆12 · Jun 10, 2021 · Updated 5 years ago
alexwarstadt / blimp
The Benchmark of Linguistic Minimal Pairs
☆170 · Dec 13, 2022 · Updated 3 years ago
simonepri / lm-scorer
📃Language Model based sentences scoring library
☆312 · Jun 8, 2026 · Updated last month
Tiiiger / bert_score
BERT score for text generation
☆1,909 · Jul 30, 2024 · Updated last year
facebookresearch / LAMA
LAnguage Model Analysis
☆1,391 · Jul 7, 2024 · Updated 2 years ago
tanyuqian / ctc-gen-eval
EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation
☆97 · Mar 20, 2023 · Updated 3 years ago
jefflai108 / Semi-Supervsied-Spoken-Language-Understanding-PyTorch
Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining
☆12 · Mar 23, 2021 · Updated 5 years ago
allenai / dont-stop-pretraining
Code associated with the Don't Stop Pretraining ACL 2020 paper
☆543 · Nov 15, 2021 · Updated 4 years ago
chho33 / LAMOL
Code for LAMOL: LAnguage MOdeling for Lifelong Language Learning
☆95 · Aug 28, 2020 · Updated 5 years ago
awslabs / speech-representations
Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)
☆104 · Nov 26, 2022 · Updated 3 years ago
facebookresearch / SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…
☆359 · Feb 22, 2022 · Updated 4 years ago
Chung-I / youtube-asr-crawler
☆10 · Sep 19, 2022 · Updated 3 years ago
richarddwang / electra_pytorch
Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!)
☆332 · Jan 10, 2024 · Updated 2 years ago
google-research / bleurt
BLEURT is a metric for Natural Language Generation based on transfer learning.
☆794 · Aug 4, 2023 · Updated 2 years ago
uber-research / PPLM
Plug and Play Language Model implementation. Allows to steer topic and attributes of GPT-2 models.
☆1,153 · Feb 20, 2024 · Updated 2 years ago
marcotcr / checklist
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
☆2,052 · Jan 9, 2024 · Updated 2 years ago
facebookresearch / Mask-Predict
A masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a…
☆246 · Sep 17, 2021 · Updated 4 years ago
chrisdonahue / ilm
Easily fine tune GPT-2 to fill in missing text
☆203 · Dec 8, 2022 · Updated 3 years ago
microsoft / COCO-LM
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
☆118 · Jul 25, 2023 · Updated 2 years ago
chrisjbryant / errant
ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences.
☆466 · May 28, 2026 · Updated last month
cisnlp / simalign
[EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
☆398 · Nov 7, 2023 · Updated 2 years ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆98 · Feb 9, 2023 · Updated 3 years ago
alontalmor / LeapOfThought
☆49 · Jun 12, 2023 · Updated 3 years ago
czyssrs / Few-Shot-NLG
Code and Data for ACL 2020 paper "Few-Shot NLG with Pre-Trained Language Model"
☆188 · May 23, 2025 · Updated last year
m-wiesner / nnet_pytorch
Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.
☆26 · Jul 25, 2024 · Updated 2 years ago
aalto-speech / subword-kaldi
Properly handle position-dependent phones in a subword lexicon FST
☆31 · Oct 26, 2020 · Updated 5 years ago
facebookresearch / XLM
PyTorch original implementation of Cross-lingual Language Model Pretraining.
☆2,923 · Feb 14, 2023 · Updated 3 years ago
voidful / asrp
ASR text preprocessing utility
☆21 · Aug 5, 2024 · Updated last year
hainan-xv / PASM
Pronunciation-assisted Subword Modeling
☆31 · May 30, 2019 · Updated 7 years ago
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
valentinhofmann / flota
☆18 · Feb 1, 2023 · Updated 3 years ago
TharinduDR / TransQuest
Transformer based translation quality estimation
☆114 · Jul 20, 2023 · Updated 3 years ago
facebookresearch / anli
Adversarial Natural Language Inference Benchmark
☆402 · May 12, 2022 · Updated 4 years ago
tomohideshibata / BERT-related-papers
BERT-related papers
☆2,033 · Aug 12, 2023 · Updated 2 years ago
awasthiabhijeet / PIE
Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models …
☆233 · Mar 24, 2023 · Updated 3 years ago
google-research / multilingual-t5
☆1,294 · Dec 15, 2022 · Updated 3 years ago
athena-team / athena-decoder
☆76 · Mar 18, 2022 · Updated 4 years ago
google-research / electra
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
☆2,367 · Mar 23, 2024 · Updated 2 years ago
WorksApplications / SudachiTra
Japanese tokenizer for Transformers
☆80 · Dec 15, 2023 · Updated 2 years ago