TheSeriousProgrammer / SimpleBitNet
Simple Adaptation of BitNet
☆32 Updated last year
Alternatives and similar repositories for SimpleBitNet
Users who are interested in SimpleBitNet are comparing it to the repositories listed below
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 Updated last year
- LoRA and DoRA from Scratch Implementations ☆211 Updated last year
- Training small GPT-2 style models using Kolmogorov-Arnold networks. ☆121 Updated last year
- LoRA: Low-Rank Adaptation of Large Language Models implemented using PyTorch ☆117 Updated 2 years ago
- A repository for log-time feedforward networks ☆222 Updated last year
- This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT)… ☆72 Updated 2 years ago
- The repository for the code of the UltraFastBERT paper ☆518 Updated last year
- Fine-Tuning Llama3-8B LLM in a multi-GPU environment using DeepSpeed ☆19 Updated last year
- Code repository for Black Mamba ☆258 Updated last year
- ☆69 Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆289 Updated 7 months ago
- Official PyTorch implementation of QA-LoRA ☆141 Updated last year
- Hugging Face compatible implementation of RetNet (Retentive Networks, https://arxiv.org/pdf/2307.08621.pdf) including parallel, recurrent,… ☆226 Updated last year
- Prune transformer layers ☆69 Updated last year
- Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates ☆466 Updated last year
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆195 Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆231 Updated last year
- A NumPy implementation of the Transformer model in "Attention is All You Need" ☆58 Updated last year
- Tutorial for how to build BERT from scratch ☆99 Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆202 Updated last year
- Notes on quantization in neural networks ☆104 Updated last year
- Implementation of DoRA ☆304 Updated last year
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆48 Updated last year
- Experimenting with small language models ☆74 Updated last year
- Collection of autoregressive model implementations ☆86 Updated 6 months ago
- Place where folks can contribute to 🤗 community events ☆426 Updated last year
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆102 Updated 2 years ago
- Annotated version of the Mamba paper ☆490 Updated last year
- Rebuild the Stable Diffusion Model in a single Python script. Tutorial for Harvard ML from Scratch Series ☆217 Updated 9 months ago
- An implementation of the base GPT-3 Model architecture from the paper by OpenAI "Language Models are Few-Shot Learners" ☆18 Updated last year