evintunador / minLlama3
A simplified version of Meta's Llama 3 model, intended for learning
☆41 · Updated last year
Alternatives and similar repositories for minLlama3
Users that are interested in minLlama3 are comparing it to the libraries listed below
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆68 · Updated last year
- A family of compressed models obtained via pruning and knowledge distillation ☆343 · Updated 7 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆198 · Updated 11 months ago
- A project to improve skills of large language models ☆429 · Updated this week
- Training and fine-tuning an LLM in Python and PyTorch. ☆42 · Updated last year
- LLaMA 2 implemented from scratch in PyTorch ☆336 · Updated last year
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind ☆177 · Updated 9 months ago
- Micro Llama is a small Llama-based model with 300M parameters trained from scratch with a $500 budget ☆152 · Updated last year
- Minimal GRPO implementation from scratch ☆90 · Updated 3 months ago
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning ☆357 · Updated 9 months ago
- A compact LLM pretrained in 9 days by using high quality data ☆314 · Updated 2 months ago
- ☆174 · Updated 5 months ago
- Distributed training (multi-node) of a Transformer model ☆72 · Updated last year
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper ☆137 · Updated 11 months ago
- LoRA and DoRA from Scratch Implementations ☆204 · Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆243 · Updated 7 months ago
- Notes and commented code for RLHF (PPO) ☆96 · Updated last year
- Pre-training code for Amber 7B LLM ☆166 · Updated last year
- Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge. ☆80 · Updated last year
- Collection of autoregressive model implementations ☆85 · Updated 2 months ago
- This is the official repository for Inheritune. ☆111 · Updated 4 months ago
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture. ☆175 · Updated last year
- Unofficial implementation of https://arxiv.org/pdf/2407.14679 ☆45 · Updated 9 months ago
- [ACL 2024] Progressive LLaMA with Block Expansion. ☆505 · Updated last year
- Tutorial for how to build BERT from scratch ☆94 · Updated last year
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆158 · Updated 2 months ago
- Spherical Merge PyTorch/HF format Language Models with minimal feature loss. ☆129 · Updated last year
- The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed". ☆173 · Updated 2 months ago
- ☆39 · Updated last month
- Reference implementation of Mistral AI 7B v0.1 model. ☆28 · Updated last year