Staged Training for Transformer Language Models
☆33 · Mar 31, 2022 · Updated 3 years ago
Alternatives and similar repositories for staged-training
Users that are interested in staged-training are comparing it to the libraries listed below.
- Source code for paper: Knowledge Inheritance for Pre-trained Language Models ☆38 · Apr 24, 2022 · Updated 3 years ago
- This is the unofficial implementation of LEMON (ICLR'2024). ☆12 · Apr 14, 2024 · Updated last year
- {DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is} ☆14 · Jun 18, 2023 · Updated 2 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 · Dec 14, 2021 · Updated 4 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆24 · Jul 18, 2019 · Updated 6 years ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆13 · Nov 21, 2023 · Updated 2 years ago
- ☆16 · May 14, 2024 · Updated last year
- ☆13 · Feb 12, 2023 · Updated 3 years ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 · Jul 19, 2023 · Updated 2 years ago
- ☆14 · Jul 11, 2022 · Updated 3 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 · May 2, 2021 · Updated 4 years ago
- LTG-Bert ☆34 · Jan 8, 2024 · Updated 2 years ago
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 · Mar 14, 2022 · Updated 4 years ago
- IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization ☆12 · Nov 23, 2021 · Updated 4 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 · Dec 5, 2022 · Updated 3 years ago
- Meta Representation Transformation for Low-resource Cross-lingual Learning ☆41 · May 5, 2021 · Updated 4 years ago
- Implementation of Cascaded Head-colliding Attention (ACL'2021) ☆11 · Sep 16, 2021 · Updated 4 years ago
- “Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition” (EMNLP 2022) ☆16 · Feb 2, 2023 · Updated 3 years ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆27 · Feb 16, 2026 · Updated last month
- Code and data for Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction (ECML-PKDD 22) ☆14 · Sep 6, 2022 · Updated 3 years ago
- Getting interpretable dimensions in word embedding spaces. ☆15 · Jul 6, 2023 · Updated 2 years ago
- Code for paper "Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging" ☆16 · May 31, 2019 · Updated 6 years ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 · Jun 28, 2024 · Updated last year
- [NeurIPS 2022] "A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models", Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li… ☆21 · Jan 9, 2024 · Updated 2 years ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆62 · Nov 21, 2024 · Updated last year
- Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021. ☆30 · Nov 5, 2021 · Updated 4 years ago
- Named entity recognition for the legal domain ☆43 · Jun 1, 2021 · Updated 4 years ago
- Code for ACL 2023 paper titled "Lifting the Curse of Capacity Gap in Distilling Language Models" ☆30 · Jul 14, 2023 · Updated 2 years ago
- 💵 Code for Less is More for Long Document Summary Evaluation by LLMs (Wu*, Iso* et al; EACL 2024) ☆11 · Feb 22, 2024 · Updated 2 years ago
- Implementation of our paper "Scaling Back-Translation with Domain Text Generation for Sign Language Gloss Translation". Accepted in EACL … ☆11 · May 22, 2023 · Updated 2 years ago
- Group-conditional DRO to alleviate spurious correlations ☆15 · Jul 15, 2021 · Updated 4 years ago
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D ☆15 · Nov 9, 2023 · Updated 2 years ago
- The dataset consists of public social media url pairs and the corresponding entailment label for an external conference (ACL 2021). Each … ☆14 · Aug 16, 2021 · Updated 4 years ago
- PyTorch Language Modeling Toolkit for Fast Weight Programmers ☆19 · Jun 11, 2025 · Updated 9 months ago
- Code for the Globetrotter project ☆23 · Mar 17, 2022 · Updated 4 years ago
- Named entity annotation tool ☆28 · Jul 6, 2023 · Updated 2 years ago
- Source code for the ACL-IJCNLP 2021 paper entitled "T-DNA: Taming Pre-trained Language Models with N-gram Representations for Low-Resourc… ☆19 · Jan 12, 2023 · Updated 3 years ago
- Tagger for explicit cause-and-effect relationships in text ☆11 · Jan 8, 2020 · Updated 6 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆96 · Feb 9, 2023 · Updated 3 years ago