Manuel030 / alpaca-opt
Yet another LLM
☆10 · Updated 2 years ago
Alternatives and similar repositories for alpaca-opt
Users interested in alpaca-opt are comparing it to the libraries listed below.
- ☆40 · Updated 2 years ago
- The Next Generation Multi-Modality Superintelligence ☆70 · Updated last year
- ☆63 · Updated last year
- Demonstration that finetuning a RoPE model on longer sequences than it was pre-trained on extends the model's context limit ☆63 · Updated 2 years ago
- ☆74 · Updated 2 years ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆70 · Updated 2 years ago
- GPT-2 small trained on phi-like data ☆68 · Updated last year
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Updated 2 years ago
- ☆15 · Updated 2 years ago
- QLoRA with Enhanced Multi-GPU Support ☆37 · Updated 2 years ago
- Fast approximate inference on a single GPU with sparsity-aware offloading ☆39 · Updated 2 years ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆34 · Updated last year
- Command-line script for running inference with models such as MPT-7B-Chat ☆100 · Updated 2 years ago
- Implementation of the Mamba SSM with hf_integration. ☆55 · Updated last year
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER)… ☆121 · Updated 2 years ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ☆110 · Updated 2 years ago
- ☆68 · Updated last year
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆101 · Updated 2 years ago
- Merge LLMs that are split into parts ☆27 · Updated 6 months ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach… ☆169 · Updated 2 years ago
- entropix-style sampling + GUI ☆27 · Updated last year
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast ☆150 · Updated last year
- Full finetuning of large language models without large memory requirements ☆94 · Updated 4 months ago
- Image Diffusion block-merging technique applied to transformer-based Language Models. ☆56 · Updated 2 years ago
- ☆33 · Updated 2 years ago
- HuggingChat-like UI in Gradio ☆70 · Updated 2 years ago
- inference code for mixtral-8x7b-32kseqlen ☆105 · Updated 2 years ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆15 · Updated 2 years ago
- ☆66 · Updated last week