clabrugere / scratch-llm
Implements an LLM similar to Meta's Llama 2 from the ground up in PyTorch, for educational purposes.
☆35Updated 3 months ago
Alternatives and similar repositories for scratch-llm
Users that are interested in scratch-llm are comparing it to the libraries listed below
- Benchmarking different models with PyTorch 2.0☆21Updated 2 years ago
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces is based on the Microsoft Phi 2 small language model (SLM) for text generat…☆14Updated last year
- ☆17Updated last year
- ☆15Updated last year
- ☆14Updated 11 months ago
- Finetuning BLOOM on a single GPU using gradient-accumulation☆31Updated 2 years ago
- Use a TPU in Colab with a single line, using the tf2 package.☆11Updated 3 years ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)☆28Updated last year
- A gzip-based text-classification system.☆33Updated last year
- ☆15Updated 4 months ago
- Supplementary material for our paper "Compute Trends Across Three Eras of Machine Learning".☆40Updated 3 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆34Updated last week
- Experiments with BitNet inference on CPU☆55Updated last year
- Lightweight Llama 3 8B Inference Engine in CUDA C☆47Updated last month
- ML/DL Math and Method notes☆60Updated last year
- ☆18Updated 3 months ago
- Manages vllm-nccl dependency☆17Updated 11 months ago
- Pre-train BERT from scratch, with HuggingFace. Accompanies the blog post: sidsite.com/posts/bert-from-scratch☆40Updated last year
- Sample code base for Gemma2 (9B) and Llama3-8B-Finetune-and-RAG, implemented on the Kaggle platform☆21Updated 3 months ago
- LLaMA implementation for HuggingFace Transformers☆38Updated 2 years ago
- ☆17Updated last year
- Training and fine-tuning an LLM in Python and PyTorch.☆41Updated last year
- Tutorial for LLM developers about engine design, service deployment, evaluation/benchmark, etc. Provide a C/S style optimized LLM inferen…☆19Updated last year
- Some experiments on transformer models☆11Updated last year
- Inference Llama 2 in C++☆44Updated last year
- minimal LLM scripts for 24GB VRAM GPUs. training, inference, whatever☆38Updated last month
- A collection of reproducible inference engine benchmarks☆30Updated 3 weeks ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆23Updated 3 weeks ago
- A clean implementation of Low-Rank Adaptation of Large Language Models☆8Updated last year
- ☆17Updated 9 months ago