zphang / transformers

Code and models for BERT on STILTs

☆52

Alternatives and similar repositories for transformers

Users who are interested in transformers compare it to the repositories listed below.

LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆209 · Updated last year
zsc / llama_infer
Inference script for Meta's LLaMA models using Hugging Face wrapper
☆110 · Updated 2 years ago
LLM360 / amber-train
Pre-training code for Amber 7B LLM
☆169 · Updated last year
DachengLi1 / LongChat
Official repository for LongChat and LongEval
☆531 · Updated last year
neelsjain / NEFTune
Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning
☆404 · Updated last year
liutiedong / goat
A fine-tuned LLaMA that is good at arithmetic tasks
☆178 · Updated 2 years ago
huggingface / transformers-bloom-inference
Fast Inference Solutions for BLOOM
☆564 · Updated last year
HuangLK / transpeeder
Train LLaMA on a single A100 80G node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism
☆225 · Updated 2 years ago
young-geng / koala_data_pipeline
The data processing pipeline for the Koala chatbot language model
☆118 · Updated 2 years ago
bigscience-workshop / xmtf
Crosslingual Generalization through Multitask Finetuning
☆537 · Updated last year
linhduongtuan / BLOOM-LORA
Due to restriction of LLaMA, we try to reimplement BLOOM-LoRA (much less restricted BLOOM license here https://huggingface.co/spaces/bigs…
☆184 · Updated 2 years ago
shjwudp / c4-dataset-script
Inspired by google c4, here is a series of colossal clean data cleaning scripts focused on CommonCrawl data processing. Including Chinese…
☆131 · Updated 2 years ago
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆219 · Updated 2 years ago
FreedomIntelligence / MultilingualSIFT
MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning
☆94 · Updated 2 years ago
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024)
☆204 · Updated last year
gpt4life / alpagasus
Unofficial implementation of AlpaGasus
☆93 · Updated 2 years ago
orhonovich / unnatural-instructions
☆180 · Updated 2 years ago
nlpxucan / evol-instruct
☆275 · Updated 2 years ago
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201 · Updated last year
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆477 · Updated last year
OpenLMLab / scaling-rope
Code for Scaling Laws of RoPE-based Extrapolation
☆73 · Updated 2 years ago
Dahoas / reward-modeling
☆98 · Updated 2 years ago
galatolofederico / vanilla-llama
Plain PyTorch implementation of LLaMA
☆188 · Updated 2 years ago
akoksal / LongForm
Reverse Instructions to generate instruction tuning data with corpus examples
☆216 · Updated last year
jondurbin / bagel
A bagel, with everything.
☆324 · Updated last year
LowinLi / transformers-stream-generator
This is a text generation method which returns a generator, streaming out each token in real-time during inference, based on Huggingface/…
☆97 · Updated last year
kaistAI / SelFee
Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"
☆228 · Updated 2 years ago
bojone / rerope
Rectified Rotary Position Embeddings
☆384 · Updated last year
THUDM / icetk
A unified tokenization tool for Images, Chinese and English.
☆153 · Updated 2 years ago
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆126 · Updated 2 years ago