rmihaylov/falcontune

Tune any FALCON in 4-bit

☆462

Alternatives and similar repositories for falcontune

Users that are interested in falcontune are comparing it to the libraries listed below.

rmihaylov / mpttune
Tune MPTs
☆84 · Jun 17, 2023 · Updated 3 years ago
turboderp / exllama
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
☆2,934 · Sep 30, 2023 · Updated 2 years ago
leehanchung / SMILE-factory
Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA
☆106 · Updated this week
artidoro / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆10,968 · Jun 10, 2024 · Updated 2 years ago
eugenepentland / landmark-attention-qlora
Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA
☆123 · Jun 16, 2023 · Updated 3 years ago
jondurbin / airoboros
Customizable implementation of the self-instruct paper.
☆1,051 · Mar 7, 2024 · Updated 2 years ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning a RoPE model on sequences longer than the pre-trained model's context adapts the model's context limit
☆62 · Jun 21, 2023 · Updated 3 years ago
mosaicml / llm-foundry
LLM training code for Databricks foundation models
☆4,431 · Mar 25, 2026 · Updated 4 months ago
axolotl-ai-cloud / axolotl
Go ahead and axolotl questions
☆12,247 · Updated this week
epfml / landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
☆426 · Dec 20, 2023 · Updated 2 years ago
mzbac / qlora-inference-multi-gpu
☆14 · May 25, 2023 · Updated 3 years ago
openlm-research / open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
☆7,533 · Jul 16, 2023 · Updated 3 years ago
qwopqwop200 / GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
☆3,072 · Jul 13, 2024 · Updated 2 years ago
huggingface / text-generation-inference
Large Language Model Text Generation Inference
☆10,882 · Mar 21, 2026 · Updated 4 months ago
nlpxucan / WizardLM
LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath
☆9,480 · Jun 7, 2025 · Updated last year
kuleshov-group / llmtools
Finetuning Large Language Models on One Consumer GPU in 2 Bits
☆732 · May 25, 2024 · Updated 2 years ago
deep-diver / LLM-As-Chatbot
LLM as a Chatbot Service
☆3,320 · Nov 20, 2023 · Updated 2 years ago
Lightning-AI / lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad…
☆6,083 · Jul 1, 2025 · Updated last year
jquesnelle / yarn
YaRN: Efficient Context Window Extension of Large Language Models
☆1,740 · Apr 17, 2024 · Updated 2 years ago
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆250 · Jul 2, 2026 · Updated 3 weeks ago
Dhaladom / TALIS
Simple and fast server for GPTQ-quantized LLaMA inference
☆24 · May 18, 2023 · Updated 3 years ago
Lightning-AI / litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
☆13,525 · Updated this week
AutoGPTQ / AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
☆5,075 · Apr 11, 2025 · Updated last year
hyperonym / basaran
Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Tra…
☆1,283 · Jan 24, 2024 · Updated 2 years ago
taprosoft / llm_finetuning
Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…
☆143 · Oct 17, 2023 · Updated 2 years ago
turboderp-org / exllamav2
A fast inference library for running LLMs locally on modern consumer-class GPUs
☆4,593 · Mar 4, 2026 · Updated 4 months ago
CStanKonrad / long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform…
☆1,465 · Nov 7, 2023 · Updated 2 years ago
mzbac / gptq-cuda-api
☆21 · May 27, 2023 · Updated 3 years ago
Vahe1994 / SpQR
☆554 · Feb 8, 2026 · Updated 5 months ago
huggingface / alignment-handbook
Robust recipes to align language models with human and AI preferences
☆5,645 · May 26, 2026 · Updated 2 months ago
menloparklab / falcon-langchain
Falcon LLM with Chat UI using LangChain and Chainlit
☆169 · Jul 30, 2023 · Updated 2 years ago
tloen / alpaca-lora
Instruct-tune LLaMA on consumer hardware
☆18,913 · Jul 29, 2024 · Updated last year
OpenLMLab / LOMO
LOMO: LOw-Memory Optimization
☆994 · Jul 2, 2024 · Updated 2 years ago
Victorwz / LongMem
Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".
☆827 · Mar 30, 2024 · Updated 2 years ago
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
☆2,024 · Jan 15, 2025 · Updated last year
salesforce / xgen
Salesforce open-source LLMs with 8k sequence length.
☆727 · Jun 2, 2026 · Updated last month
abacaj / fine-tune-mistral
Fine-tune mistral-7B on 3090s, a100s, h100s
☆735 · Oct 11, 2023 · Updated 2 years ago
Birch-san / falcon-play
Command-line script for inferencing from models such as falcon-7b-instruct
☆75 · Jun 1, 2023 · Updated 3 years ago
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,249 · Jul 11, 2024 · Updated 2 years ago