zphang / transformers
Code and models for BERT on STILTs
☆53 · Updated last year
Alternatives and similar repositories for transformers:
Users interested in transformers are comparing it to the libraries listed below.
- Inference script for Meta's LLaMA models using Hugging Face wrapper ☆111 · Updated last year
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆208 · Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆66 · Updated 3 months ago
- An Experiment on Dynamic NTK Scaling RoPE ☆62 · Updated last year
- Unofficial implementation of AlpaGasus ☆90 · Updated last year
- Instruct-tune Open LLaMA / RedPajama / StableLM models on consumer hardware using QLoRA ☆80 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆115 · Updated last year
- ☆106 · Updated last year
- The data processing pipeline for the Koala chatbot language model ☆117 · Updated last year
- FuseAI Project ☆80 · Updated this week
- Official implementation for "Extending LLMs' Context Window with 100 Samples" ☆76 · Updated last year
- ☆74 · Updated last year
- Code for the paper "Towards the Law of Capacity Gap in Distilling Language Models" ☆97 · Updated 6 months ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning ☆89 · Updated last year
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆99 · Updated last year
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Long Lengths (ICLR 2024) ☆204 · Updated 8 months ago
- Code for Scaling Laws of RoPE-based Extrapolation ☆70 · Updated last year
- ☆177 · Updated last year
- This is a text generation method which returns a generator, streaming out each token in real-time during inference, based on Huggingface/… ☆96 · Updated 10 months ago
- LLaMA tuning with the Stanford Alpaca dataset using DeepSpeed and Transformers ☆50 · Updated last year
- Train LLaMA on a single A100 80G node using 🤗 transformers and 🚀 DeepSpeed Pipeline Parallelism ☆213 · Updated last year
- Pre-training code for the Amber 7B LLM ☆160 · Updated 8 months ago
- All available datasets for Instruction Tuning of Large Language Models ☆241 · Updated last year
- Experiments on speculative sampling with Llama models ☆123 · Updated last year
- A fine-tuned LLaMA that is good at arithmetic tasks ☆177 · Updated last year
- Source code for the ACL 2023 paper "Decoder Tuning: Efficient Language Understanding as Decoding" ☆48 · Updated last year
- ☆96 · Updated last year
- The aim of this repository is to utilize LLaMA to reproduce and enhance Stanford Alpaca ☆96 · Updated last year
- Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens" ☆125 · Updated 6 months ago
- [ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long-context language model evaluation benchmark ☆369 · Updated 6 months ago