evintunador / minLlama3
a simplified version of Meta's Llama 3 model to be used for learning
☆41Updated last year
Alternatives and similar repositories for minLlama3
Users who are interested in minLlama3 are comparing it to the libraries listed below
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)…☆67Updated last year
- Distributed training (multi-node) of a Transformer model☆68Updated last year
- Tutorial for how to build BERT from scratch☆93Updated last year
- LLaMA 2 implemented from scratch in PyTorch☆329Updated last year
- minimal GRPO implementation from scratch☆90Updated 2 months ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget☆151Updated last year
- Research projects built on top of Transformers☆53Updated 2 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind☆178Updated 8 months ago
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆111Updated 8 months ago
- Reference implementation of Mistral AI 7B v0.1 model.☆29Updated last year
- Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge.☆80Updated last year
- Spherically merge PyTorch/HF-format language models with minimal feature loss.☆124Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆221Updated 7 months ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023.☆122Updated last year
- Official repository for ORPO☆453Updated last year
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆29Updated 3 months ago
- a simplified version of Google's Gemma model to be used for learning☆25Updated last year
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆170Updated last year
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning☆356Updated 9 months ago
- Notes and commented code for RLHF (PPO)☆96Updated last year
- Pre-training code for Amber 7B LLM☆166Updated last year
- A simplified implementation for experimenting with RLVR on GSM8K; this repository provides a starting point for exploring reasoning.☆97Updated 4 months ago
- Training and fine-tuning an LLM in Python and PyTorch.☆42Updated last year
- LoRA: Low-Rank Adaptation of Large Language Models, implemented using PyTorch☆105Updated last year
- Embed arbitrary modalities (images, audio, documents, etc) into large language models.☆184Updated last year
- Training small GPT-2 style models using Kolmogorov-Arnold networks.☆117Updated last year
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024)☆203Updated last year
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆166Updated 9 months ago
- From scratch implementation of a vision language model in pure PyTorch☆220Updated last year
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper☆136Updated 10 months ago
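One technique that recurs in the list above is LoRA (low-rank adaptation). As a rough, dependency-free sketch of the core idea only — the listed repository uses PyTorch and its actual implementation will differ, and the helper names `matmul` and `lora_merge` here are purely illustrative:

```python
# Hypothetical sketch of the LoRA merge step: instead of updating a full
# weight matrix W (d_out x d_in), LoRA trains two small matrices
# B (d_out x r) and A (r x d_in) and adds their product, scaled by
# alpha / r, to the frozen W. Plain lists of lists keep this runnable
# without any dependencies.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(row[k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for row in X]

def lora_merge(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank."""
    r = len(A)                     # A has shape r x d_in
    delta = matmul(B, A)           # B @ A has the same shape as W
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Tiny example: d_out = d_in = 2, rank r = 1.
W = [[1.0, 0.0],
     [0.0, 1.0]]
A = [[1.0, 2.0]]           # r x d_in
B = [[0.5], [0.25]]        # d_out x r
merged = lora_merge(W, A, B, alpha=1.0)
# merged == [[1.5, 1.0], [0.25, 1.5]]
```

The rank `r` is the knob that trades adapter capacity against parameter count: the trainable parameters number r * (d_out + d_in) instead of d_out * d_in.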