joeljang / Pretraining_T5_custom_dataset
Continue Pretraining T5 on custom dataset based on available pretrained model checkpoints
☆39Updated 3 years ago
Related projects
Alternatives and complementary repositories for Pretraining_T5_custom_dataset
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)"☆96Updated last year
- Code and models for the paper "Questions Are All You Need to Train a Dense Passage Retriever (TACL 2023)"☆61Updated last year
- Code base of In-Context Learning for Dialogue State Tracking☆44Updated last year
- Repository for ACL'22 paper: Dynamic Latent Extraction for Abstractive Long-Input Summarization☆55Updated last year
- Code and Data Repo for ACL'23 Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)"☆53Updated 10 months ago
- Open-WikiTable: Dataset for Open Domain Question Answering with Complex Reasoning over Table☆19Updated last year
- Collection of scripts to pretrain T5 on unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example.☆31Updated 3 years ago
- We construct and introduce DIALFACT, a testing benchmark dataset of crowd-annotated conversational claims, paired with pieces of evidence fr…☆41Updated 2 years ago
- The code implementation of the EMNLP2022 paper: DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Gene…☆25Updated last year
- A comprehensive paper list of Reasoning over Tables.☆26Updated 2 years ago
- Code for the ACL 2024 paper "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning"☆11Updated 9 months ago
- Long-context pretrained encoder-decoder models☆95Updated 2 years ago
- First explanation metric (diagnostic report) for text generation evaluation☆61Updated 4 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision☆81Updated 3 weeks ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study☆42Updated last year
- Dataset for TACL 2022 paper: "FeTaQA: Free-form Table Question Answering"☆80Updated last year
- Official code for "Continual Prompt Tuning for Dialog State Tracking" (ACL 2022).☆27Updated last year
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆97Updated last year
- [EMNLP 2022] Salience Allocation as Guidance for Abstractive Summarization☆23Updated last year
- [NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages☆37Updated last month
- Script to pre-train Hugging Face Transformers BART with TensorFlow 2☆33Updated last year
- [WWW 2024] The official repo for paper "Scalable and Effective Generative Information Retrieval".☆52Updated 6 months ago
- HANNA, a large annotated dataset of Human-ANnotated NArratives for ASG evaluation.☆28Updated last month
- The official code of TACL 2021, "Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies".☆63Updated 2 years ago
- MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset☆61Updated 2 years ago
- InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…☆28Updated 5 months ago