clankur / einygpt
a transformer implemented primarily using einops and trained on the tinystories dataset
☆12 · Updated last year
Alternatives and similar repositories for einygpt
Users who are interested in einygpt are comparing it to the repositories listed below.
- PANiC - PAraphrasing Noun-Compounds ☆15 · Updated 7 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 · Updated 3 years ago
- Experiments with generating opensource language model assistants ☆97 · Updated 2 years ago
- Observe the slow deterioration of my mental sanity in the github commit history ☆12 · Updated 2 years ago
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆60 · Updated 3 years ago
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L… ☆43 · Updated 2 years ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆58 · Updated last year
- A highly sophisticated sequence-to-sequence model for code generation ☆40 · Updated 4 years ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 · Updated 2 years ago
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆41 · Updated last month
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆73 · Updated last year
- Evaluation pipeline for the BabyLM Challenge 2023. ☆76 · Updated last year
- A Benchmark Dataset for Understanding Disfluencies in Question Answering ☆63 · Updated 4 years ago
- Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆12 · Updated 4 months ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https… ☆44 · Updated last year
- ☆44 · Updated 8 months ago
- This tool helps automatic generation of grammatically valid synthetic Code-mixed data by utilizing linguistic theories such as Equivalenc… ☆55 · Updated last year
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics". ☆16 · Updated 3 years ago
- ☆39 · Updated last year
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models" ☆44 · Updated last year
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo… ☆39 · Updated last year
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22 ☆66 · Updated 2 years ago
- ☆21 · Updated last year
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning ☆33 · Updated 7 months ago
- Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE ☆18 · Updated 3 years ago
- A Framework aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆18 · Updated last year
- Multi-Domain Expert Learning ☆67 · Updated last year
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?" ☆58 · Updated 2 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 · Updated 4 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆25 · Updated 8 months ago