coastalcph / hierarchical-transformers
Hierarchical Attention Transformers (HAT)
Related projects:
- The official repository for the paper "Efficient Long-Text Understanding Using Short-Text Models" (Ivgi et al., 2022).
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset.
- Long-context pretrained encoder-decoder models.
- Google's BigBird (Jax/Flax & PyTorch) @ 🤗 Transformers.
- Code for the paper "Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models".
- The official repository for the MIA 2022 (NAACL 2022 Workshop) Shared Task on Cross-lingual Open-Retrieval Question Answering.
- Embedding Recycling for Language Models.
- The Multitask Long Document Benchmark.
- GenIE (Generative Information Extraction), an autoregressive information extraction system implemented in PyTorch.
- Efficient Memory-Augmented Transformers.
- SWIM-IR, a synthetic Wikipedia-based multilingual information retrieval training set with 28 million query-passage pairs spanning 33 languages.
- Simple Questions Generate Named Entity Recognition Datasets (EMNLP 2022).
- Frustratingly Simple Pretraining Alternatives to Masked Language Modeling (EMNLP 2021).
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023).
- The official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering".
- Pre-training BART in Flax on The Pile dataset.
- Exploring the Benefits of Training Expert Language Models over Instruction Tuning (ICML 2023).
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022, Best Paper Honorable Mention).
- A business-grade retrieval system (BM25) usable from Python in just a few lines.
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists".
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP.
- PyTorch implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks.
- Research code for "What to Pre-Train on? Efficient Intermediate Task Selection" (EMNLP 2021).
- Mr. TyDi, a multilingual benchmark dataset built on TyDi and covering eleven typologically diverse languages.
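
One of the entries above mentions using a BM25 retrieval system from Python in just a few lines. As a rough illustration of what BM25 itself computes — a generic, self-contained sketch, not that repository's API; the function name and example corpus here are hypothetical:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized doc in `docs` against tokenized `query` with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each distinct query term.
    df = {t: sum(1 for d in docs if t in d) for t in set(query)}
    # Standard smoothed BM25 inverse document frequency.
    idf = {t: math.log(1 + (N - n + 0.5) / (n + 0.5)) for t, n in df.items()}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            # Term-frequency saturation (k1) and length normalization (b).
            s += idf[t] * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(s)
    return scores

# Toy corpus: the document sharing the most query terms scores highest.
docs = [
    "long document modeling with hierarchical attention".split(),
    "short text classification".split(),
    "hierarchical transformers for long documents".split(),
]
print(bm25_scores("long documents".split(), docs))
```

Libraries such as `rank_bm25` wrap exactly this kind of scoring behind a two-line API, which is what "in just a few lines" typically amounts to in practice.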