broskicodes / slms
Experimenting with small language models
☆71Updated last year
Alternatives and similar repositories for slms
Users that are interested in slms are comparing it to the libraries listed below
- Video+code lecture on building nanoGPT from scratch☆69Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆232Updated 10 months ago
- Low-Rank adapter extraction for fine-tuned transformers models☆176Updated last year
- Train your own small bitnet model☆75Updated 10 months ago
- A little (lil) Language Model (LM). A tiny reproduction of LLaMA 3's model architecture.☆52Updated 4 months ago
- Various installation guides for Large Language Models☆74Updated 4 months ago
- ☆75Updated 11 months ago
- This is our own implementation of 'Layer Selective Rank Reduction'☆240Updated last year
- ☆134Updated last year
- Toolkit for attaching, training, saving and loading of new heads for transformer models☆286Updated 6 months ago
- Convenience scripts to finetune (chat-)LLaMa3 and other models for any language☆315Updated last year
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation …☆185Updated last year
- An LLM cookbook for building your own from scratch, all the way from gathering data to training a model☆155Updated last year
- ☆127Updated 5 months ago
- Testing LLM reasoning abilities with family relationship quizzes.☆63Updated 7 months ago
- ☆168Updated 2 years ago
- Set of scripts to finetune LLMs☆37Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41Updated last year
- ☆207Updated last year
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget☆161Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization☆65Updated last year
- Fine-tuning LLMs using QLoRA☆263Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆146Updated 6 months ago
- So, I trained a 130M Llama architecture that I coded from the ground up to build a small instruct model from scratch. Trained on FineWeb dataset…☆15Updated 5 months ago
- Collection of autoregressive model implementations☆86Updated 4 months ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆184Updated last year
- ☆118Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub☆161Updated last year
- ☆37Updated 8 months ago
- An unsupervised model merging algorithm for Transformers-based language models.☆107Updated last year