bkitano / llama-from-scratch
Llama from scratch, or How to implement a paper without crying
☆567 Updated last year
Alternatives and similar repositories for llama-from-scratch
Users that are interested in llama-from-scratch are comparing it to the libraries listed below
- What would you do with 1000 H100s... ☆1,050 Updated last year
- From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) ☆715 Updated 7 months ago
- LLM Workshop by Sourab Mangrulkar ☆381 Updated 11 months ago
- Minimalistic large language model 3D-parallelism training ☆1,898 Updated last week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆795 Updated last month
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,518 Updated this week
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,387 Updated last year
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,534 Updated last year
- Puzzles for learning Triton ☆1,671 Updated 6 months ago
- nanoGPT style version of Llama 3.1 ☆1,372 Updated 9 months ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,251 Updated 3 months ago
- The repository for the code of the UltraFastBERT paper ☆516 Updated last year
- Best practices for distilling large language models. ☆547 Updated last year
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,396 Updated last week
- A repository for research on medium sized language models. ☆497 Updated last month
- GPU programming related news and material links ☆1,540 Updated 5 months ago
- ☆518 Updated 6 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,390 Updated last month
- Building blocks for foundation models. ☆502 Updated last year
- Code for BLT research paper ☆1,675 Updated 2 weeks ago
- A comprehensive deep dive into the world of tokens ☆224 Updated 11 months ago
- LLM papers I'm reading, mostly on inference and model compression ☆730 Updated last year
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,005 Updated 9 months ago
- ☆536 Updated 9 months ago
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆713 Updated last year
- Official implementation of Half-Quadratic Quantization (HQQ) ☆818 Updated this week
- LLaMA 2 implemented from scratch in PyTorch ☆329 Updated last year
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,495 Updated last year
- LoRA and DoRA from Scratch Implementations ☆204 Updated last year
- ☆1,210 Updated 3 months ago