Staged Training for Transformer Language Models
☆33 · Mar 31, 2022 · Updated 4 years ago
Alternatives and similar repositories for staged-training
Users that are interested in staged-training are comparing it to the libraries listed below.
- Masked Structural Growth for 2x Faster Language Model Pre-training ☆25 · Apr 28, 2024 · Updated last year
- decontamination ☆30 · Mar 4, 2026 · Updated last month
- ☆16 · May 6, 2021 · Updated 4 years ago
- {DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is} ☆14 · Jun 18, 2023 · Updated 2 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 · Dec 14, 2021 · Updated 4 years ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆24 · Jul 18, 2019 · Updated 6 years ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆13 · Nov 21, 2023 · Updated 2 years ago
- ☆16 · May 14, 2024 · Updated last year
- ☆13 · Feb 12, 2023 · Updated 3 years ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉 ☆12 · Jul 19, 2023 · Updated 2 years ago
- ☆14 · Jul 11, 2022 · Updated 3 years ago
- GC4LM: A Colossal (Biased) language model for German ☆13 · May 2, 2021 · Updated 4 years ago
- LTG-Bert ☆34 · Jan 8, 2024 · Updated 2 years ago
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 · Mar 14, 2022 · Updated 4 years ago
- IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization ☆12 · Nov 23, 2021 · Updated 4 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 · Dec 5, 2022 · Updated 3 years ago
- “Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition” (EMNLP 2022) ☆16 · Feb 2, 2023 · Updated 3 years ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆27 · Feb 16, 2026 · Updated last month
- Code and data for Distributional Correlation–Aware Knowledge Distillation for Stock Trading Volume Prediction (ECML-PKDD 22) ☆15 · Sep 6, 2022 · Updated 3 years ago
- Getting interpretable dimensions in word embedding spaces. ☆15 · Jul 6, 2023 · Updated 2 years ago
- Code for paper "Neural Semi-Markov Conditional Random Fields for Robust Character-Based Part-of-Speech Tagging" ☆16 · May 31, 2019 · Updated 6 years ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model" ☆23 · Jun 28, 2024 · Updated last year
- Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021. ☆30 · Nov 5, 2021 · Updated 4 years ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆63 · Nov 21, 2024 · Updated last year
- Code for ACL 2023 paper titled "Lifting the Curse of Capacity Gap in Distilling Language Models" ☆30 · Jul 14, 2023 · Updated 2 years ago
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision ☆11 · Jul 22, 2024 · Updated last year
- 💵 Code for Less is More for Long Document Summary Evaluation by LLMs (Wu*, Iso* et al; EACL 2024) ☆11 · Feb 22, 2024 · Updated 2 years ago
- Group-conditional DRO to alleviate spurious correlations ☆15 · Jul 15, 2021 · Updated 4 years ago
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D ☆15 · Nov 9, 2023 · Updated 2 years ago
- Set-Equivariant Deep Learning Models ☆22 · Dec 23, 2021 · Updated 4 years ago
- Named entity annotation tool ☆28 · Jul 6, 2023 · Updated 2 years ago
- A collection of notebooks for Natural Language Processing ☆25 · Jan 13, 2025 · Updated last year
- Tagger for explicit cause-and-effect relationships in text ☆11 · Jan 8, 2020 · Updated 6 years ago
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models. ☆21 · Jul 13, 2022 · Updated 3 years ago
- PyTorch Language Modeling Toolkit for Fast Weight Programmers ☆21 · Jun 11, 2025 · Updated 10 months ago
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020. ☆13 · Nov 17, 2020 · Updated 5 years ago
- Convert Transkribus PAGE-XML to standard PAGE-XML ☆12 · Dec 10, 2025 · Updated 4 months ago
- ☆15 · Jul 16, 2021 · Updated 4 years ago
- ☆32 · Aug 4, 2021 · Updated 4 years ago