Code repository for the paper "MrT5: Dynamic Token Merging for Efficient Byte-level Language Models."
☆54 · Sep 25, 2025 · Updated 5 months ago
Alternatives and similar repositories for mrt5
Users who are interested in mrt5 are comparing it to the libraries listed below:
- PathPiece tokenizer ☆13 · Nov 10, 2024 · Updated last year
- Linear Attention for Efficient Bidirectional Sequence Modeling ☆15 · May 13, 2025 · Updated 9 months ago
- Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and visi… ☆33 · Feb 7, 2025 · Updated last year
- Digital texts in Prakrit ☆10 · Sep 14, 2025 · Updated 5 months ago
- ☆10 · Nov 18, 2024 · Updated last year
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆88 · Sep 12, 2025 · Updated 5 months ago
- ☆13 · Sep 6, 2022 · Updated 3 years ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 · Oct 20, 2022 · Updated 3 years ago
- Tool to perform paired evaluation of automatic systems ☆13 · Oct 20, 2021 · Updated 4 years ago
- Official repository for BMVC 2022 paper: Global Proxy-based Hard Mining for Visual Place Recognition ☆18 · Mar 7, 2023 · Updated 2 years ago
- triple-encoders is a library for contextualizing distributed Sentence Transformers representations. ☆15 · Sep 3, 2024 · Updated last year
- Datapunt open panorama project ☆14 · May 6, 2024 · Updated last year
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) ☆16 · Jul 27, 2024 · Updated last year
- Study the temporal performance degradation of machine learning models. ☆16 · Jan 26, 2024 · Updated 2 years ago
- The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGI… ☆16 · May 4, 2022 · Updated 3 years ago
- ☆18 · Jun 12, 2023 · Updated 2 years ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆39 · Jun 11, 2025 · Updated 8 months ago
- [NeurIPS 2024] Image Understanding Makes for A Good Tokenizer for Image Generation ☆22 · Dec 17, 2024 · Updated last year
- An NLP pipeline for Hebrew ☆41 · Jun 16, 2025 · Updated 8 months ago
- [NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea… ☆76 · Feb 4, 2024 · Updated 2 years ago
- Experiments for efforts to train a new and improved t5 ☆76 · Apr 15, 2024 · Updated last year
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆18 · Nov 26, 2023 · Updated 2 years ago
- Official repository of the paper "JIST: Joint Image and Sequence Training for Sequential Visual Place Recognition" ☆23 · Dec 15, 2023 · Updated 2 years ago
- Fuel innovation and advance language models with HomoScriptor: A vibrant, community-driven dataset for fine-tuning large language models. ☆18 · Oct 14, 2023 · Updated 2 years ago
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆21 · Mar 7, 2024 · Updated last year
- ☆18 · Jul 25, 2024 · Updated last year
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆226 · Sep 18, 2025 · Updated 5 months ago
- Large language model Mistral for DNA ☆21 · Sep 12, 2025 · Updated 5 months ago
- Task Compass: Scaling Multi-task Pre-training with Task Prefix (EMNLP 2022: Findings) (stay tuned & more will be updated) ☆22 · Oct 17, 2022 · Updated 3 years ago
- Visual Place Recognition ☆31 · Nov 25, 2025 · Updated 3 months ago
- BirdSpotter is a Python package that provides an influence and bot detection toolkit for Twitter. ☆19 · Mar 10, 2021 · Updated 4 years ago
- ☆20 · Mar 30, 2022 · Updated 3 years ago
- Statewide Visual Geolocalization in the Wild (ECCV 2024) ☆73 · Dec 2, 2024 · Updated last year
- ☆45 · Jul 5, 2022 · Updated 3 years ago
- A system for Named Entity Disambiguation based on Random Walks and Learning to Rank. ☆19 · Feb 26, 2022 · Updated 4 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆28 · Apr 17, 2024 · Updated last year
- Python package for Natural Language Processing (NLP), focused on low-resource languages spoken in Mexico. ☆23 · Sep 4, 2025 · Updated 6 months ago
- [NeurIPS 2023] The implementation of paper "Empowering Collaborative Filtering Generalization via Principled Adversarial Contrastive Loss… ☆20 · Feb 21, 2024 · Updated 2 years ago
- ☆25 · Feb 20, 2026 · Updated last week