huggingface / distill-bloom-deepspeed
Teacher-student distillation using DeepSpeed
☆19 · Updated 3 years ago
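For orientation, here is a minimal sketch of the teacher-student (knowledge-distillation) objective this kind of repo implements, assuming PyTorch; the `distillation_loss` name, the temperature, and the loss weighting are illustrative assumptions, not taken from the repo's code.

```python
# Minimal knowledge-distillation sketch (assumption: PyTorch; illustrative
# names and hyperparameters, not the repo's actual code).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term (teacher) with hard-label cross-entropy."""
    # Soften both distributions with the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence, scaled by T^2 as in Hinton et al. (2015).
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```

At BLOOM scale, a loss like this is typically wrapped in a DeepSpeed engine so the teacher's forward pass and the student's training step can be sharded across GPUs.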
Alternatives and similar repositories for distill-bloom-deepspeed
Users interested in distill-bloom-deepspeed are comparing it to the libraries listed below.
- ☆128 · Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆81 · Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 · Updated last month
- A new metric for evaluating the faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated 2 years ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆29 · Updated last year
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24) ☆63 · Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 · Updated 5 months ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023 ☆36 · Updated last year
- DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization (ACL 2022) ☆50 · Updated 2 years ago
- Simple implementation of Speculative Sampling in NumPy for GPT-2 (a toy sketch of the accept/reject step follows this list). ☆98 · Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks
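As an aside on the Speculative Sampling entry above: a toy NumPy sketch of its accept/reject step, assuming made-up draft and target distributions over a five-token vocabulary. None of the names or numbers come from the linked repo's GPT-2 code.

```python
# Toy speculative-sampling step in NumPy (assumption: hand-picked
# distributions for illustration, not the linked repo's implementation).
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(p_target, q_draft):
    """Draft a token from q, accept with probability min(1, p/q);
    on rejection, resample from the residual max(p - q, 0) distribution.
    This yields an exact sample from p_target."""
    token = rng.choice(len(q_draft), p=q_draft)
    if rng.random() < min(1.0, p_target[token] / q_draft[token]):
        return token, True
    residual = np.maximum(p_target - q_draft, 0.0)
    residual /= residual.sum()
    return rng.choice(len(residual), p=residual), False

# Example: mismatched target (large model) and draft (small model) distributions.
p = np.array([0.40, 0.30, 0.15, 0.10, 0.05])  # target model
q = np.array([0.25, 0.25, 0.20, 0.20, 0.10])  # draft model
print(speculative_step(p, q))
```

The speedup in practice comes from the draft model proposing several tokens per target-model forward pass; the step above is the correctness core that keeps the output distribution identical to the target model's.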