hpcaitech / ColossalAI-Pytorch-lightning

☆24

Alternatives and similar repositories for ColossalAI-Pytorch-lightning

Users who are interested in ColossalAI-Pytorch-lightning are comparing it to the libraries listed below.

Lightning-Universe / lightning-ColossalAI
Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI
☆56 Updated 2 years ago
cimeister / typical-sampling
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
☆81 Updated 3 years ago
microsoft / Efficient-Large-LM-Trainer
☆38 Updated last year
sunyt32 / torchscale
Transformers at any scale
☆41 Updated last year
facebookresearch / ELECTRA-Fewshot-Learning
This repository contains the code for paper Prompting ELECTRA Few-Shot Learning with Discriminative Pre-Trained Models.
☆48 Updated 3 years ago
jason9693 / ETA4LLMs
Calculating Expected Time for training LLM.
☆38 Updated 2 years ago
amazon-science / dq-bart
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022)
☆50 Updated 2 years ago
nreimers / se-pytorch-xla
☆21 Updated 4 years ago
lyutyuh / structured-span-selector
A Structured Span Selector (NAACL 2022). A structured span selector with a WCFG for span selection tasks (coreference resolution, semanti…
☆21 Updated 3 years ago
JetRunner / PABEE
Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit".
☆66 Updated 4 years ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆96 Updated 2 years ago
yxuansu / Contrastive_Search_Is_What_You_Need
[TMLR'23] Contrastive Search Is What You Need For Neural Text Generation
☆121 Updated 2 years ago
NormXU / Consistent-DynamicNTKRoPE
An Experiment on Dynamic NTK Scaling RoPE
☆64 Updated last year
oriram / spider
☆54 Updated 2 years ago
facebookresearch / bart_ls
Long-context pretrained encoder-decoder models
☆96 Updated 3 years ago
facebookresearch / ketod
KETOD Knowledge-Enriched Task-Oriented Dialogue
☆32 Updated 2 years ago
RUCAIBox / ELMER
This repository is the official implementation of our EMNLP 2022 paper ELMER: A Non-Autoregressive Pre-trained Language Model for Efficie…
☆26 Updated 3 years ago
thunlp / TR-BERT
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"
☆48 Updated 3 years ago
yxuansu / Contrastive_Search_versus_Contrastive_Decoding
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
☆27 Updated last year
seonghyeonye / TAPP
[AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
☆78 Updated last year
guilhermemr04 / scaling-zero-shot-retrieval
No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval
☆29 Updated 3 years ago
allenai / data-efficient-finetuning
Code for paper 'Data-Efficient FineTuning'
☆28 Updated 2 years ago
clovaai / length-adaptive-transformer
Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)
☆102 Updated 5 years ago
IBM / PoWER-BERT
Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…
☆62 Updated 2 months ago
Beomi / transformers-language-modeling
Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3
☆23 Updated 4 years ago
lucidrains / coco-lm-pytorch
Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch
☆46 Updated 4 years ago
jzbjyb / ReAtt
Retrieval as Attention
☆82 Updated 2 years ago
FreedomIntelligence / DPTDR
Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval
☆26 Updated 2 years ago
microsoft / BANG
BANG is a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. AR and NAR generat…
☆28 Updated 3 years ago
SeanNaren / minGPT
A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training!
☆113 Updated 2 years ago