jondurbin / bagelLinks

A bagel, with everything.

☆324

Alternatives and similar repositories for bagel

Users that are interested in bagel are comparing it to the libraries listed below

Sorting:

Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆208Updated last year
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆239Updated last year
sabetAI / BLoRA
batched loras
☆346Updated 2 years ago
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201Updated last year
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆177Updated last year
epfml / landmark-attention
Landmark Attention: Random-Access Infinite Context Length for Transformers
☆426Updated last year
huggingface / cosmopedia
☆544Updated 11 months ago
SkunkworksAI / hydra-moe
☆415Updated last year
jondurbin / qlora
QLoRA: Efficient Finetuning of Quantized LLMs
☆76Updated last year
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆273Updated last year
apoorvumang / prompt-lookup-decoding
☆572Updated last year
VikParuchuri / textbook_quality
Generate textbook-quality synthetic LLM pretraining data
☆505Updated 2 years ago
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆722Updated last year
dzhulgakov / llama-mistral
Inference code for Mistral and Mixtral hacked up into original Llama implementation
☆368Updated last year
LLM360 / amber-train
Pre-training code for Amber 7B LLM
☆168Updated last year
persimmon-ai-labs / adept-inference
Inference code for Persimmon-8B
☆413Updated 2 years ago
Leeroo-AI / mergoo
A library for easily merging multiple LLM experts, and efficiently train the merged LLM.
☆493Updated last year
arcee-ai / PruneMe
Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models
☆249Updated last year
datamllab / LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
☆662Updated last year
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆240Updated 11 months ago
pratyushasharma / laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
☆388Updated last year
FastEval / FastEval
Fast & more realistic evaluation of chat language models. Includes leaderboard.
☆189Updated last year
uukuguy / multi_loras
Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…
☆158Updated last year
hydrallm / llama-moe-v1
☆96Updated 2 years ago
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)
☆204Updated last year
keeeeenw / MicroLlama
Micro Llama is a small Llama based model with 300M parameters trained from scratch with $500 budget
☆161Updated 2 months ago
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆311Updated last year
arielnlee / Platypus
Code for fine-tuning Platypus fam LLMs using LoRA
☆628Updated last year
Digitous / LLM-SLERP-Merge
Spherical Merge Pytorch/HF format Language Models with minimal feature loss.
☆138Updated 2 years ago
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆118Updated last year