broskicodes / slms
Experimenting with small language models
☆59 · Updated last year
Alternatives and similar repositories for slms:
Users who are interested in slms are comparing it to the libraries listed below.
- Video+code lecture on building nanoGPT from scratch ☆65 · Updated 7 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆225 · Updated 3 months ago
- 1.58-bit LLaMa model ☆80 · Updated 9 months ago
- One click templates for inferencing Language Models ☆146 · Updated 2 weeks ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆52 · Updated 11 months ago
- Various installation guides for Large Language Models ☆62 · Updated 2 months ago
- ☆122 · Updated 5 months ago
- A fast batching API to serve LLM models ☆180 · Updated 9 months ago
- ☆108 · Updated 5 months ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆156 · Updated last year
- run ollama & gguf easily with a single command ☆49 · Updated 8 months ago
- ☆196 · Updated 8 months ago
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture. ☆129 · Updated 8 months ago
- ☆52 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆38 · Updated 8 months ago
- entropix style sampling + GUI ☆25 · Updated 3 months ago
- Set of scripts to finetune LLMs ☆36 · Updated 10 months ago
- The training notebooks that were similar to the original script used to train TinyMistral. ☆19 · Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 · Updated 6 months ago
- ☆120 · Updated last week
- Easy-to-use, high-performance knowledge distillation for LLMs ☆40 · Updated 3 weeks ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆154 · Updated 3 months ago
- Train your own small bitnet model ☆64 · Updated 3 months ago
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT ☆26 · Updated 11 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆37 · Updated 3 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆190 · Updated 6 months ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆146 · Updated last year
- ☆18 · Updated 10 months ago
- ☆65 · Updated 8 months ago
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆260 · Updated last week