philschmid / knowledge-distillation-transformers-pytorch-sagemaker
☆46 Updated 3 years ago
Alternatives and similar repositories for knowledge-distillation-transformers-pytorch-sagemaker:
Users who are interested in knowledge-distillation-transformers-pytorch-sagemaker are comparing it to the libraries listed below.
- Finetune mistral-7b-instruct for sentence embeddings ☆81 Updated 11 months ago
- Prune transformer layers ☆68 Updated 10 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆110 Updated 9 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆134 Updated 5 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆104 Updated last year
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss. ☆120 Updated last year
- DSIR large-scale data selection framework for language model training ☆246 Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆76 Updated last year
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning ☆90 Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆127 Updated 3 months ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆94 Updated last year
- ☆145 Updated last year
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆198 Updated 2 weeks ago
- ☆127 Updated 5 months ago
- ☆255 Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ☆254 Updated 9 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆142 Updated 7 months ago
- This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods. ☆53 Updated 8 months ago
- Unofficial implementation of AlpaGasus ☆90 Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆148 Updated 7 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆137 Updated 4 months ago
- Multilingual Large Language Models Evaluation Benchmark ☆123 Updated 8 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆76 Updated 6 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆139 Updated 6 months ago
- ☆94 Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆235 Updated 5 months ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆213 Updated last month
- ☆120 Updated 6 months ago
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same… ☆47 Updated 5 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆74 Updated 10 months ago