knotgrass / How-Transformers-Work
A study guide to learn about Transformers
☆11 · Updated last year
Alternatives and similar repositories for How-Transformers-Work
Users that are interested in How-Transformers-Work are comparing it to the libraries listed below
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆68 · Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆231 · Updated 7 months ago
- ☆39 · Updated last month
- A set of scripts and notebooks on LLM finetuning and dataset creation ☆110 · Updated 9 months ago
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation ☆80 · Updated last month
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆225 · Updated 7 months ago
- ☆174 · Updated 5 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆185 · Updated 3 weeks ago
- ☆193 · Updated 4 months ago
- Prune transformer layers ☆69 · Updated last year
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resultin… ☆23 · Updated last year
- Tutorial for how to build BERT from scratch ☆94 · Updated last year
- experiments with inference on llama ☆104 · Updated last year
- An extension of the nanoGPT repository for training small MoE models. ☆152 · Updated 3 months ago
- A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks. ☆304 · Updated this week
- A Multilingual Replicable Instruction-Following Model ☆93 · Updated 2 years ago
- LoRA and DoRA from Scratch Implementations ☆204 · Updated last year
- GPU Kernels ☆182 · Updated 2 months ago
- Pre-training code for Amber 7B LLM ☆166 · Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆77 · Updated 8 months ago
- code for training & evaluating Contextual Document Embedding models ☆195 · Updated last month
- LLaMA 2 implemented from scratch in PyTorch ☆336 · Updated last year
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning ☆91 · Updated last year
- BERT explained from scratch ☆14 · Updated last year
- A Python package made to generate sequences (greedy and beam-search) from PyTorch (not necessarily HF transformers) models. ☆17 · Updated 3 weeks ago
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens paper ☆137 · Updated 11 months ago
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆204 · Updated 6 months ago
- Well-documented, unit-tested, type-checked and formatted implementation of a vanilla transformer, for educational purposes. ☆253 · Updated last year
- Easy and Efficient Quantization for Transformers ☆199 · Updated 4 months ago
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback ☆97 · Updated last year