zjersey / Lightseq-ARM
☆30 · Updated last year
Alternatives and similar repositories for Lightseq-ARM:
Users interested in Lightseq-ARM are comparing it to the libraries listed below.
- Demonstration that finetuning a RoPE model on sequences longer than those used in pre-training extends the model's context limit ☆63 · Updated last year
- ☆32 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆39 · Updated 9 months ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆14 · Updated last year
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆99 · Updated last year
- Spherical merging of PyTorch/HF-format language models with minimal feature loss. ☆115 · Updated last year
- Finetune any model on HF in less than 30 seconds ☆58 · Updated last month
- Implementation of the Mamba SSM with hf_integration. ☆56 · Updated 5 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆21 · Updated 6 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 · Updated 11 months ago
- A repository for research on medium-sized language models. ☆76 · Updated 9 months ago
- Fast approximate inference on a single GPU with sparsity-aware offloading ☆38 · Updated last year
- Sakura-SOLAR-DPO: Merge, SFT, and DPO ☆116 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆77 · Updated 10 months ago
- The Next Generation Multi-Modality Superintelligence ☆71 · Updated 5 months ago
- ☆31 · Updated last year
- ☆62 · Updated 7 months ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆41 · Updated last year
- Manage histories for LLM-powered applications ☆88 · Updated last year
- RWKV-7: Surpassing GPT ☆79 · Updated 3 months ago
- ☆74 · Updated last year
- The GeoV model is a large language model designed by Georges Harik that uses Rotary Positional Embeddings with Relative distances (RoPER).… ☆121 · Updated last year
- Zeta implementation of a reusable, plug-and-play feedforward layer from the paper "Exponentially Faster Language Modeling" ☆15 · Updated 3 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆96 · Updated 4 months ago
- ☆24 · Updated last year
- ☆26 · Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆115 · Updated last year
- ☆26 · Updated 11 months ago
- ☆53 · Updated 8 months ago