chrisociepa / allamo

Simple, hackable and fast implementation for training/finetuning medium-sized LLaMA-based models

☆182

Alternatives and similar repositories for allamo

Users who are interested in allamo compare it to the repositories listed below.

johnsmith0031 / alpaca_lora_4bit
☆534 · Updated last year
alasdairforsythe / tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript
☆603 · Updated last year
rmihaylov / falcontune
Tune any FALCON in 4-bit
☆464 · Updated 2 years ago
eugenepentland / landmark-attention-qlora
Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA
☆123 · Updated 2 years ago
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆247 · Updated last year
jondurbin / bagel
A bagel, with everything.
☆324 · Updated last year
jondurbin / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆76 · Updated last year
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆239 · Updated last year
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆177 · Updated last year
Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆208 · Updated last year
TheBlokeAI / AIScripts
Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub
☆160 · Updated 2 years ago
rmihaylov / mpttune
Tune MPTs
☆84 · Updated 2 years ago
zphang / minimal-llama
☆457 · Updated 2 years ago
taprosoft / llm_finetuning
Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…
☆146 · Updated 2 years ago
declare-lab / flan-alpaca
This repository contains code for extending the Stanford Alpaca synthetic instruction tuning to existing instruction-tuned models such as…
☆356 · Updated 2 years ago
NolanoOrg / cformers
SoTA Transformers with C-backend for fast inference on your CPU.
☆308 · Updated last year
PotatoSpudowski / fastLLaMa
fastLLaMa: An experimental high-performance framework for running Decoder-only LLMs with 4-bit quantization in Python using a C/C++ backe…
☆412 · Updated 2 years ago
epfml / landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
☆426 · Updated last year
mzbac / qlora-fine-tune
☆166 · Updated 2 years ago
pbelcak / UltraFastBERT
The repository for the code of the UltraFastBERT paper
☆518 · Updated last year
kuutsav / llm-toys
Small finetuned LLMs for a diverse set of useful tasks
☆127 · Updated 2 years ago
SkunkworksAI / hydra-moe
☆415 · Updated last year
iwalton3 / mpt-lora-patch
Patch for MPT-7B which allows using and training a LoRA
☆58 · Updated 2 years ago
togethercomputer / redpajama.cpp
Extends the original llama.cpp repo to support the RedPajama model.
☆118 · Updated last year
skeskinen / bert.cpp
ggml implementation of BERT
☆492 · Updated last year
thomasantony / llamacpp-python
Python bindings for llama.cpp
☆198 · Updated 2 years ago
sabetAI / BLoRA
batched loras
☆346 · Updated 2 years ago
uukuguy / multi_loras
Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…
☆158 · Updated last year
aigoopy / llm-jeopardy
Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts
☆108 · Updated 2 years ago
harrisonvanderbyl / rwkvstic
Framework agnostic python runtime for RWKV models
☆145 · Updated 2 years ago