VatsaDev / nanoChatGPT
nanoGPT turned into a chat model
☆68Updated last year
Alternatives and similar repositories for nanoChatGPT
Users that are interested in nanoChatGPT are comparing it to the libraries listed below
- Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget☆150Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs☆77Updated last year
- Video+code lecture on building nanoGPT from scratch☆67Updated 11 months ago
- QLoRA with Enhanced Multi GPU Support☆37Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models☆173Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…☆147Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆231Updated 6 months ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub☆162Updated last year
- Train your own small bitnet model☆70Updated 6 months ago
- ☆87Updated last year
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ☆102Updated last year
- The training notebooks that were similar to the original script used to train TinyMistral.☆21Updated last year
- Implementation of the Mamba SSM with hf_integration.☆56Updated 8 months ago
- Tune MPTs☆84Updated last year
- inference code for mixtral-8x7b-32kseqlen☆100Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app…☆168Updated last year
- Training and fine-tuning an LLM in Python and PyTorch.☆41Updated last year
- Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.☆42Updated last year
- Merge Transformers language models by use of gradient parameters.☆208Updated 9 months ago
- ☆43Updated 3 months ago
- Instruct-tune Open LLaMA / RedPajama / StableLM models on consumer hardware using QLoRA☆81Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT☆27Updated last year
- ☆52Updated 3 months ago
- Multi-Domain Expert Learning☆67Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆199Updated 10 months ago
- A set of scripts and notebooks on LLM fine-tuning and dataset creation☆109Updated 7 months ago
- Reimplementation of the task generation part from the Alpaca paper☆119Updated 2 years ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆154Updated 7 months ago
- My fork of Allen AI's OLMo for educational purposes.☆30Updated 5 months ago
- entropix style sampling + GUI☆26Updated 6 months ago