MDK8888 / GPTFast
Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.
☆683 · Updated 8 months ago
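GPTFast's tagline claims a 7.6-9x speedup for Hugging Face Transformers inference using native PyTorch tooling. As a rough illustration of that style of PyTorch-native acceleration (a minimal sketch, not GPTFast's actual API; the "gpt2" model and settings below are placeholders), a Hugging Face causal LM's forward pass can be wrapped in torch.compile before generation:

```python
# Minimal sketch of torch.compile-based acceleration of a stock Hugging Face
# causal LM. This is NOT GPTFast's API; "gpt2" is only a placeholder model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=dtype).to(device)
model.eval()

# Compile the forward pass: the first call is slow (graph capture + compilation),
# subsequent decoding steps reuse the compiled kernels.
model.forward = torch.compile(model.forward, mode="reduce-overhead")

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The first generate call pays the compilation cost; later calls reuse the compiled graph, which is where most of the throughput gain in this kind of approach comes from.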
Alternatives and similar repositories for GPTFast:
Users who are interested in GPTFast are comparing it to the libraries listed below.
- ☆959 · Updated 3 months ago
- Training LLMs with QLoRA + FSDP · ☆1,476 · Updated 5 months ago
- ☆444 · Updated last year
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… · ☆1,252 · Updated 2 weeks ago
- Reaching LLaMA2 Performance with 0.1M Dollars · ☆980 · Updated 9 months ago
- Automatically evaluate your LLMs in Google Colab · ☆620 · Updated 11 months ago
- A simple, performant and scalable Jax LLM! · ☆1,708 · Updated this week
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI · ☆1,378 · Updated last year
- ☆713 · Updated last month
- ☆529 · Updated 8 months ago
- Visualize the intermediate output of Mistral 7B · ☆360 · Updated 3 months ago
- Official implementation of Half-Quadratic Quantization (HQQ) · ☆800 · Updated this week
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining · ☆697 · Updated last year
- Llama-3 agents that can browse the web by following instructions and talking to you · ☆1,400 · Updated 4 months ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling · ☆1,673 · Updated 9 months ago
- A bagel, with everything. · ☆320 · Updated last year
- The repository for the code of the UltraFastBERT paper · ☆517 · Updated last year
- ☆412 · Updated last year
- ☆210 · Updated 10 months ago
- Fine-tune mistral-7B on 3090s, a100s, h100s · ☆711 · Updated last year
- llama3.np is a pure NumPy implementation of the Llama 3 model. · ☆981 · Updated last week
- ☆863 · Updated 7 months ago
- Stop messing around with finicky sampling parameters and just use DRµGS! · ☆349 · Updated 11 months ago
- The official implementation of Self-Play Fine-Tuning (SPIN) · ☆1,151 · Updated 11 months ago
- Train Models Contrastively in Pytorch · ☆700 · Updated last month
- This is our own implementation of 'Layer Selective Rank Reduction' · ☆237 · Updated 11 months ago
- LLM Analytics · ☆658 · Updated 6 months ago
- ☆706 · Updated last year
- Ship RAG based LLM web apps in seconds. · ☆990 · Updated last year
- An innovative library for efficient LLM inference via low-bit quantization · ☆350 · Updated 8 months ago