fattorib / Little-GPT

GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more!
23 · Updated last year

Related projects: