naklecha / llama3-from-scratch
llama3 implementation one matrix multiplication at a time
☆14,088 Updated 8 months ago
Alternatives and similar repositories for llama3-from-scratch:
Users who are interested in llama3-from-scratch are comparing it to the repositories listed below.
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,366 Updated 7 months ago
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ☆16,111 Updated this week
- Finetune Llama 3.3, DeepSeek-R1, Mistral, Phi-4 & Gemma 2 LLMs 2-5x faster with 70% less memory ☆23,278 Updated this week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆39,060 Updated last month
- LLM training in simple, raw C/CUDA ☆25,227 Updated 4 months ago
- Inference Llama 2 in one file of pure C ☆17,974 Updated 6 months ago
- SGLang is a fast serving framework for large language models and vision language models. ☆8,509 Updated this week
- Implement a ChatGPT-like LLM in PyTorch from scratch, step by step ☆39,031 Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆11,453 Updated this week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆22,098 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆36,497 Updated this week
- Machine Learning Engineering Open Book ☆12,636 Updated this week
- Structured Text Generation ☆10,594 Updated this week
- PyTorch native post-training library ☆4,789 Updated this week
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ☆17,165 Updated this week
- Go ahead and axolotl questions ☆8,484 Updated this week
- Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. ☆45,426 Updated 2 weeks ago
- Train transformer language models with reinforcement learning. ☆11,140 Updated this week
- Tensor library for machine learning ☆11,728 Updated this week
- Official inference library for Mistral models ☆9,913 Updated 2 months ago
- Fast and memory-efficient exact attention ☆15,318 Updated this week
- [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond. ☆21,299 Updated 5 months ago
- DSPy: The framework for programming—not prompting—language models ☆21,675 Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ☆16,975 Updated this week
- Run any open-source LLMs, such as Llama, Mistral, as OpenAI compatible API endpoint in the cloud. ☆10,504 Updated this week
- A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers) ☆9,675 Updated 8 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆10,210 Updated 7 months ago
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆6,782 Updated 6 months ago
- ☆4,058 Updated 8 months ago
- tiktoken is a fast BPE tokeniser for use with OpenAI's models. ☆13,255 Updated 4 months ago