zphang / transformers
Code and models for BERT on STILTs
☆52 Updated 2 years ago
Alternatives and similar repositories for transformers
Users who are interested in transformers are comparing it to the repositories listed below.
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆209 Updated 2 years ago
- Pre-training code for Amber 7B LLM ☆170 Updated last year
- Inference script for Meta's LLaMA models using Hugging Face wrapper ☆110 Updated 2 years ago
- ☆98 Updated 2 years ago
- Inspired by Google C4, here is a series of colossal clean data cleaning scripts focused on CommonCrawl data processing. Including Chinese… ☆135 Updated 2 years ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆409 Updated last year
- Crosslingual Generalization through Multitask Finetuning ☆537 Updated last year
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism ☆224 Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 Updated 2 years ago
- ☆279 Updated 2 years ago
- Code used for sourcing and cleaning the BigScience ROOTS corpus ☆318 Updated 2 years ago
- Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs… ☆184 Updated 2 years ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆222 Updated 2 years ago
- Rectified Rotary Position Embeddings ☆388 Updated last year
- Official repository for LongChat and LongEval ☆534 Updated last year
- code for Scaling Laws of RoPE-based Extrapolation ☆73 Updated 2 years ago
- ☆106 Updated 2 years ago
- A bagel, with everything. ☆326 Updated last year
- A unified tokenization tool for Images, Chinese and English. ☆153 Updated 2 years ago
- ☆180 Updated 2 years ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆209 Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆69 Updated 2 years ago
- Reverse Instructions to generate instruction tuning data with corpus examples ☆216 Updated last year
- Open Source WizardCoder Dataset ☆163 Updated 2 years ago
- Scaling Data-Constrained Language Models ☆341 Updated 7 months ago
- The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca ☆98 Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆204 Updated last year
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper ☆150 Updated last year
- [ICLR 2023] Codebase for Copy-Generator model, including an implementation of kNN-LM ☆190 Updated last year
- Fast Inference Solutions for BLOOM ☆566 Updated last year