clabrugere / scratch-llm
Implements an LLM similar to Meta's Llama 2 from the ground up in PyTorch, for educational purposes.
☆37Updated 5 months ago
Alternatives and similar repositories for scratch-llm
Users that are interested in scratch-llm are comparing it to the libraries listed below
- Gemma2(9B), Llama3-8B-Finetune-and-RAG, code base for sample, implemented in Kaggle platform☆22Updated 5 months ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)☆28Updated 2 years ago
- ☆14Updated last year
- Fast and memory-efficient exact attention ported to rocm☆11Updated last year
- minimal scripts for 24GB VRAM GPUs. training, inference, whatever☆41Updated last month
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po…☆92Updated 2 years ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Updated 3 years ago
- Manages vllm-nccl dependency☆17Updated last year
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces is based on the Microsoft Phi 2 small language model (SLM) for text generat…☆14Updated last year
- Playground for Transformers☆51Updated last year
- A collection of reproducible inference engine benchmarks☆32Updated 3 months ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆24Updated 3 weeks ago
- Make triton easier☆47Updated last year
- several types of attention modules written in PyTorch for learning purposes☆53Updated 9 months ago
- Benchmarking PyTorch 2.0 different models☆21Updated 2 years ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Updated last year
- Utilities for Training Very Large Models☆58Updated 9 months ago
- a curated list of the role of small models in the LLM era☆102Updated 10 months ago
- Repository containing awesome resources regarding Hugging Face tooling.☆47Updated last year
- Create a source of truth for ML model results and browse it on Papers with Code☆32Updated 4 years ago
- Inference Llama 2 in one file of pure C++☆83Updated last year
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…☆55Updated 3 weeks ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs☆110Updated last year
- Mixtral finetuning☆19Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆69Updated last year
- Multi-Layer Key-Value sharing experiments on Pythia models☆33Updated last year
- code for paper "Accessing higher dimensions for unsupervised word translation"☆21Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated 2 months ago
- Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.☆10Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated last year