asahi417 / lmppl
Calculate the perplexity of a text with pre-trained language models. Supports MLMs (e.g., DeBERTa), recurrent LMs (e.g., GPT-3), and encoder-decoder LMs (e.g., Flan-T5).
☆155 Updated 6 months ago
Alternatives and similar repositories for lmppl:
Users who are interested in lmppl are comparing it to the repositories listed below.
- Source Code of Paper "GPTScore: Evaluate as You Desire" ☆246 Updated 2 years ago
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" ☆167 Updated 3 years ago
- Codebase, data and models for the SummaC paper in TACL ☆91 Updated 2 months ago
- ☆174 Updated 2 years ago
- Token-level Reference-free Hallucination Detection ☆94 Updated last year
- Multilingual Large Language Models Evaluation Benchmark ☆123 Updated 8 months ago
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆198 Updated last year
- Tk-Instruct is a Transformer model that is tuned to solve many NLP tasks by following instructions. ☆180 Updated 2 years ago
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆218 Updated 5 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆128 Updated last year
- contrastive decoding ☆199 Updated 2 years ago
- Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation. ☆95 Updated last year
- ☆73 Updated last year
- Finetune mistral-7b-instruct for sentence embeddings ☆81 Updated 11 months ago
- ☆278 Updated last year
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆170 Updated 4 months ago
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs ☆240 Updated 10 months ago
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. ☆463 Updated last year
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆340 Updated last week
- RARR: Researching and Revising What Language Models Say, Using Language Models ☆46 Updated last year
- ☆132 Updated 3 months ago
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆238 Updated last year
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆137 Updated 4 months ago
- Code for Editing Factual Knowledge in Language Models ☆137 Updated 3 years ago
- A Multilingual Replicable Instruction-Following Model ☆93 Updated last year
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆127 Updated last year
- Code and data for "Lost in the Middle: How Language Models Use Long Contexts" ☆340 Updated last year
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆100 Updated last year
- ☆72 Updated 4 months ago
- ☆182 Updated last year