viai957 / llama-inference
View external linksLinks

A simple implementation of Llama 1, 2. Llama Architecture built from scratch using PyTorch all the models are built from scratch that includes GQA (Grouped Query Attention) , RoPE (Rotary Positional Embeddings) , RMS Norm, FeedForward Block, Encoder (as this is only for Inferencing the model) , SwiGLU (Activation Function),
13May 6, 2024Updated last year

Alternatives and similar repositories for llama-inference

Users that are interested in llama-inference are comparing it to the libraries listed below

Sorting:

Are these results useful?