manueldeprada / Pretraining-T5-PyTorch-Lightning
Collection of scripts to pretrain T5 on unlabeled text using PyTorch Lightning. CORD-19 pretraining is provided as an example.
☆32 · Apr 26, 2021 · Updated 4 years ago
Alternatives and similar repositories for Pretraining-T5-PyTorch-Lightning
Users that are interested in Pretraining-T5-PyTorch-Lightning are comparing it to the libraries listed below
- Continue Pretraining T5 on custom dataset based on available pretrained model checkpoints ☆38 · Mar 21, 2021 · Updated 4 years ago
- ☆22 · Nov 25, 2021 · Updated 4 years ago
- ☆13 · Jun 19, 2021 · Updated 4 years ago
- The official repository for Dynamic Clustering and Cluster Contrastive Learning (DCCC). ☆14 · Dec 15, 2023 · Updated 2 years ago
- ☆13 · Oct 21, 2021 · Updated 4 years ago
- A simple script for detecting and removing cryptomining malware ☆19 · Apr 4, 2022 · Updated 3 years ago
- ☆45 · Sep 12, 2021 · Updated 4 years ago
- Hugging Face RoBERTa with Flash Attention 2 ☆24 · Sep 14, 2025 · Updated 5 months ago
- ☆23 · Feb 6, 2022 · Updated 4 years ago
- ☆26 · Aug 14, 2022 · Updated 3 years ago
- Source code for "Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models", ICLR 2020. ☆30 · Jun 28, 2020 · Updated 5 years ago
- A simple example for finetuning HuggingFace T5 model. Includes code for intermediate generation. ☆26 · Nov 11, 2020 · Updated 5 years ago
- Winning solution for the Kaggle Feedback Prize Challenge. ☆66 · Sep 5, 2022 · Updated 3 years ago
- Data synthesis tool for simply and efficiently synthesizing LLM training data for different business scenarios ☆39 · Jan 2, 2025 · Updated last year
- A Python Terminal script for displaying Corporate filings on BSE exchange. ☆19 · Feb 28, 2024 · Updated last year
- A benchmark on predicting how small molecules change gene expression in different cell types. ☆13 · Jul 4, 2025 · Updated 7 months ago
- Artifact code release for paper "Uniform-Cost Multi-Path Routing for Reconfigurable Data Center Networks" ☆12 · Sep 5, 2024 · Updated last year
- Introduction page of a challenging text-to-SQL dataset: KaggleDBQA ☆42 · Sep 20, 2023 · Updated 2 years ago
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs ☆47 · Aug 26, 2024 · Updated last year
- ☆34 · Oct 30, 2020 · Updated 5 years ago
- Top-6 solution for the Tianchi COVID-19 similar sentence pair identification competition ☆77 · Jun 22, 2022 · Updated 3 years ago
- ☆14 · Jul 5, 2023 · Updated 2 years ago
- Human ID classification using mmwave radar point cloud ☆13 · Oct 18, 2025 · Updated 3 months ago
- Official implementation of the paper "ALTER: Augmentation for Large-Table-Based Reasoning" ☆15 · Aug 26, 2024 · Updated last year
- PyTorch loss functions, adapted from the Scientific Spaces (科学空间) articles "Mitigating class imbalance through the idea of mutual information" and "Generalizing 'softmax + cross-entropy' to multi-label classification" ☆12 · Aug 22, 2021 · Updated 4 years ago
- The official implementation of the paper "Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset"(ICASSP 2… ☆12 · Feb 19, 2023 · Updated 2 years ago
- The codes for our ACL'22 paper: PRBOOST: Prompt-Based Rule Discovery and Boosting for Interactive Weakly-Supervised Learning. ☆35 · Mar 18, 2022 · Updated 3 years ago
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder ☆10 · Mar 16, 2023 · Updated 2 years ago
- Real-time multi-language unit test generation tool via LSP ☆31 · Updated this week
- Structural displacement monitoring using ground-based synthetic aperture radar: Implementation of 3D displacement vector ☆14 · Mar 6, 2024 · Updated last year
- python programs and procedures that facilitate local application of the earth2observe global water resources reanalysis ☆10 · Nov 21, 2017 · Updated 8 years ago
- Token-free Language Modeling with ByGPT5 & Friends! ☆12 · Jul 18, 2025 · Updated 6 months ago
- rule matcher (context free grammar) ☆10 · Dec 27, 2019 · Updated 6 years ago
- Codebase for the MITO mmWave dataset ☆18 · Oct 26, 2025 · Updated 3 months ago
- This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch" ☆40 · Jun 9, 2023 · Updated 2 years ago
- ☆44 · Mar 3, 2023 · Updated 2 years ago
- ☆34 · Mar 22, 2021 · Updated 4 years ago
- [NAACL'22] TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning ☆94 · Jun 8, 2022 · Updated 3 years ago
- A novel incremental hierarchical clustering algorithm (KDD 22) ☆10 · Aug 31, 2023 · Updated 2 years ago