zphang / transformers
Code and models for BERT on STILTs
☆53 Updated 2 years ago
Alternatives and similar repositories for transformers
Users that are interested in transformers are comparing it to the libraries listed below
- Inference script for Meta's LLaMA models using Hugging Face wrapper ☆110 Updated 2 years ago
- An Experiment on Dynamic NTK Scaling RoPE ☆64 Updated last year
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆209 Updated last year
- Data preparation code for Amber 7B LLM ☆90 Updated last year
- Code repository for the c-BTM paper ☆106 Updated last year
- ☆96 Updated 2 years ago
- ☆179 Updated 2 years ago
- Pre-training code for Amber 7B LLM ☆166 Updated last year
- Experiments on speculative sampling with Llama models ☆126 Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆78 Updated last year
- ☆105 Updated last year
- code for Scaling Laws of RoPE-based Extrapolation ☆73 Updated last year
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆151 Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 Updated 9 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) ☆202 Updated last year
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper ☆136 Updated 10 months ago
- This is a text generation method which returns a generator, streaming out each token in real-time during inference, based on Huggingface/… ☆95 Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆78 Updated last year
- The data processing pipeline for the Koala chatbot language model ☆117 Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 Updated last year
- Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models" ☆100 Updated 10 months ago
- Unofficial implementation of AlpaGasus ☆91 Updated last year
- LLaMA Tuning with Stanford Alpaca Dataset using DeepSpeed and Transformers ☆51 Updated 2 years ago
- FuseAI Project ☆87 Updated 4 months ago
- MultilingualShareGPT, the free multi-language corpus for LLM training ☆72 Updated 2 years ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning ☆89 Updated last year
- Spherical merge of PyTorch/HF-format language models with minimal feature loss. ☆123 Updated last year
- ☆76 Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆114 Updated 2 years ago
- ☆269 Updated 2 years ago