salesforce / jaxformerLinks

Minimal library to train LLMs on TPU in JAX with pjit().

☆292

Alternatives and similar repositories for jaxformer

Users that are interested in jaxformer are comparing it to the libraries listed below

Sorting:

bigcode-project / bigcode-analysis
Repository for analysis and experiments in the BigCode project.
☆121Updated last year
loubnabnl / santacoder-finetuning
Fine-tune SantaCoder for Code/Text Generation.
☆192Updated 2 years ago
salesforce / CodeGen2
CodeGen2 models for program synthesis
☆272Updated 2 years ago
openai / human-eval-infilling
Code for the paper "Efficient Training of Language Models to Fill in the Middle"
☆183Updated 2 years ago
Sea-Snell / JAX_llama
Inference code for LLaMA models in JAX
☆118Updated last year
bigcode-project / Megatron-LM
Ongoing research training transformer models at scale
☆390Updated 11 months ago
EleutherAI / DeeperSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
☆168Updated 2 weeks ago
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆306Updated last year
salesforce / CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur…
☆541Updated 6 months ago
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆206Updated last year
CarperAI / Code-Pile
This repository contains all the code for collecting large scale amounts of code from GitHub.
☆110Updated 2 years ago
nlpxucan / evol-instruct
☆270Updated 2 years ago
imoneoi / multipack_sampler
Multipack distributed sampler for fast padding-free training of LLMs
☆199Updated 11 months ago
Zyq-scut / RLTF
Accepted by Transactions on Machine Learning Research (TMLR)
☆130Updated 10 months ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆338Updated last month
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆256Updated last year
CarperAI / cheese
Used for adaptive human in the loop evaluation of language and embedding models.
☆311Updated 2 years ago
bigcode-project / octopack
🐙 OctoPack: Instruction Tuning Code Large Language Models
☆472Updated 6 months ago
young-geng / koala_data_pipeline
The data processing pipeline for the Koala chatbot language model
☆117Updated 2 years ago
huggingface / bloom-jax-inference
☆67Updated 3 years ago
huggingface / optimum-graphcore
Blazing fast training of 🤗 Transformers on Graphcore IPUs
☆85Updated last year
google-research / babelcode
☆52Updated 5 months ago
sanjeevanahilan / nanoChatGPT
A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick
☆291Updated last year
GammaTauAI / leetcode-hard-gym
A hard gym for programming
☆159Updated last year
google / flaxformer
☆361Updated last year
LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆208Updated last year
SeanNaren / min-LLM
Minimal code to train a Large Language Model (LLM).
☆171Updated 3 years ago
haoliuhl / chain-of-hindsight
Simple next-token-prediction for RLHF
☆227Updated last year
abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
☆417Updated last year
bigcode-project / the-stack-v2
Code for the curation of The Stack v2 and StarCoder2 training data
☆114Updated last year