amazon-science / dq-bartLinks

DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022)

☆50

Alternatives and similar repositories for dq-bart

Users that are interested in dq-bart are comparing it to the libraries listed below

Sorting:

facebookresearch / ELECTRA-Fewshot-Learning
This repository contains the code for paper Prompting ELECTRA Few-Shot Learning with Discriminative Pre-Trained Models.
☆48Updated 3 years ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆96Updated 2 years ago
oriram / spider
☆54Updated 2 years ago
jzbjyb / ReAtt
Retrieval as Attention
☆82Updated 2 years ago
thunlp / TR-BERT
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"
☆48Updated 3 years ago
cambridgeltl / composable-sft
A library for parameter-efficient and composable transfer learning for NLP with sparse fine-tunings.
☆75Updated last year
hpcaitech / ColossalAI-Pytorch-lightning
☆24Updated 3 years ago
huawei-noah / Efficient-NLP
☆95Updated last year
cooelf / CompassMTL
Task Compass: Scaling Multi-task Pre-training with Task Prefix (EMNLP 2022: Findings) (stay tuned & more will be updated)
☆22Updated 3 years ago
yxuansu / Contrastive_Search_Is_What_You_Need
[TMLR'23] Contrastive Search Is What You Need For Neural Text Generation
☆121Updated 2 years ago
JetRunner / PABEE
Code for the paper "BERT Loses Patience: Fast and Robust Inference with Early Exit".
☆66Updated 4 years ago
sunyt32 / torchscale
Transformers at any scale
☆41Updated last year
ThomasScialom / T0_continual_learning
Adding new tasks to T0 without catastrophic forgetting
☆33Updated 3 years ago
NormXU / Consistent-DynamicNTKRoPE
An Experiment on Dynamic NTK Scaling RoPE
☆64Updated last year
Rojak-NLP / LLM-Code-Mixing
Can LLMs generate code-mixed sentences through zero-shot prompting?
☆11Updated 2 years ago
IBM / PoWER-BERT
Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro…
☆62Updated 2 months ago
Nicolas-BZRD / llm-recipes
☆29Updated last year
seonghyeonye / TAPP
[AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
☆78Updated last year
ghomasHudson / muld
The Multitask Long Document Benchmark
☆41Updated 3 years ago
google-research / t5x_retrieval
☆101Updated 2 years ago
bigscience-workshop / multilingual-modeling
BLOOM+1: Adapting BLOOM model to support a new unseen language
☆74Updated last year
JetRunner / LaPraDoR
🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie…
☆49Updated 3 years ago
allenai / data-efficient-finetuning
Code for paper 'Data-Efficient FineTuning'
☆28Updated 2 years ago
yxuansu / Contrastive_Search_versus_Contrastive_Decoding
An Empirical Study On Contrastive Search And Contrastive Decoding For Open-ended Text Generation
☆27Updated last year
yixinL7 / SumLLM
Repo for "On Learning to Summarize with Large Language Models as References"
☆43Updated 2 years ago
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆98Updated 2 years ago
cimeister / typical-sampling
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
☆81Updated 3 years ago
osainz59 / t5-encoder
A extension of Transformers library to include T5ForSequenceClassification class.
☆40Updated 2 years ago
martiansideofthemoon / rankgen
Official code and model checkpoints for our EMNLP 2022 paper "RankGen - Improving Text Generation with Large Ranking Models" (https://arx…
☆138Updated 2 years ago
Lightning-Universe / lightning-ColossalAI
Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI
☆56Updated 2 years ago