bminixhofer / zett
Code for Zero-Shot Tokenizer Transfer
☆128 Updated 4 months ago
Alternatives and similar repositories for zett
Users who are interested in zett are comparing it to the libraries listed below.
- ☆38 Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 Updated 8 months ago
- Prune transformer layers ☆69 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆97 Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆58 Updated last year
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆96 Updated last year
- This is the official repository for Inheritune. ☆111 Updated 3 months ago
- The HELMET Benchmark ☆149 Updated last month
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆50 Updated last month
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆24 Updated 2 weeks ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆75 Updated 9 months ago
- Evaluation pipeline for the BabyLM Challenge 2023. ☆75 Updated last year
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆89 Updated 7 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆43 Updated last year
- ☆125 Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 Updated 9 months ago
- A repository containing the code for translating popular LLM benchmarks to German. ☆25 Updated last year
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers". ☆76 Updated last year
- ☆72 Updated last year
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity… ☆25 Updated last year
- Code repository for the c-BTM paper ☆106 Updated last year
- Understand and test language model architectures on synthetic tasks. ☆197 Updated 2 months ago
- Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM) ☆89 Updated last year
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆84 Updated last year
- Code for the ACL 2023 paper: "Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Sc… ☆30 Updated last year
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆170 Updated 5 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆105 Updated 2 months ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 Updated 2 years ago
- ☆131 Updated 6 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆178 Updated 8 months ago