knotgrass / How-Transformers-Work
🧠 A study guide to learn about Transformers
☆10, updated last year
Alternatives and similar repositories for How-Transformers-Work:
Users interested in How-Transformers-Work are comparing it to the repositories listed below.
- Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback (☆92, updated last year)
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation (☆69, updated last month)
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… (☆87, updated last year)
- Tutorial on how to build BERT from scratch (☆86, updated 7 months ago)
- ☆76, updated 3 months ago
- [ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia (☆155, updated 5 months ago)
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resultin… (☆23, updated last year)
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… (☆198, updated 2 months ago)
- Distillation Contrastive Decoding: Improving LLMs Reasoning with Contrastive Decoding and Distillation (☆33, updated 10 months ago)
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… (☆56, updated last year)
- Prune transformer layers (☆67, updated 7 months ago)
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day (☆253, updated last year)
- A set of scripts and notebooks on LLM finetuning and dataset creation (☆99, updated 3 months ago)
- Experiments with inference on LLaMA (☆104, updated 7 months ago)
- Fine-tune ModernBERT on a large dataset with custom tokenizer training (☆52, updated 3 weeks ago)
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. (☆66, updated 2 months ago)
- A Multilingual Replicable Instruction-Following Model (☆94, updated last year)
- Efficient Attention for Long Sequence Processing (☆91, updated last year)
- Learn CUDA with PyTorch (☆14, updated 2 months ago)
- Due to the restrictions of LLaMA, we try to reimplement BLOOM-LoRA (the much less restrictive BLOOM license is here: https://huggingface.co/spaces/bigs…) (☆185, updated last year)
- SandLogic Lexicons (☆16, updated 3 months ago)
- ☆16, updated 2 weeks ago
- Training and fine-tuning an LLM in Python and PyTorch (☆41, updated last year)
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning (☆89, updated last year)
- Distributed training (multi-node) of a Transformer model (☆49, updated 9 months ago)
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers (☆34, updated last month)
- LLaMA 2 implemented from scratch in PyTorch (☆280, updated last year)
- Complete implementation of Llama2 with/without KV cache & inference (☆47, updated 7 months ago)
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!) (☆138, updated 7 months ago)
- A minimal example of aligning language models with RLHF similar to ChatGPT (☆215, updated last year)