gmontamat / poor-mans-transformers
Implement Transformers (and Deep Learning) from scratch in NumPy
☆23 · Updated last year
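For context, the core building block a from-scratch NumPy Transformer implementation like this one revolves around is scaled dot-product attention from "Attention is All You Need". The sketch below is a minimal, hypothetical illustration of that operation in plain NumPy; it is not taken from poor-mans-transformers, and the function names are placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    # q, k: (seq_len, d_k); v: (seq_len, d_v)
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)     # pairwise query-key similarity, scaled
    weights = softmax(scores, axis=-1)  # attention distribution per query
    return weights @ v                  # weighted sum of value vectors

# Example: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (4, 8)
```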
Related projects
Alternatives and complementary repositories for poor-mans-transformers
- A numpy implementation of the Transformer model in "Attention is All You Need" ☆50 · Updated 4 months ago
- LLaMA 2 implemented from scratch in PyTorch ☆258 · Updated last year
- Simple Adaptation of BitNet ☆29 · Updated 7 months ago
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch ☆226 · Updated 2 months ago
- LoRA and DoRA from Scratch Implementations ☆188 · Updated 8 months ago
- Techniques used to run BLOOM at inference in parallel ☆37 · Updated 2 years ago
- An open collection of implementation tips, tricks and resources for training large language models ☆460 · Updated last year
- JAX implementation of the Llama 2 model ☆210 · Updated 9 months ago
- Everything you want to know about Google Cloud TPU ☆496 · Updated 4 months ago
- Implementation of Block Recurrent Transformer in PyTorch ☆213 · Updated 3 months ago
- Fast bare-bones BPE for modern tokenizer training ☆142 · Updated last month
- Like picoGPT but for BERT. ☆50 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K-token sequences from the Pile ☆115 · Updated last year
- An implementation of masked language modeling for PyTorch, made as concise and simple as possible ☆177 · Updated last year
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆93 · Updated last month
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆222 · Updated 3 weeks ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆48 · Updated 7 months ago
- Experiments with generating open-source language model assistants ☆97 · Updated last year
- A repository for log-time feedforward networks