MDK8888 / GPTFast
Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.
☆688 · Updated 5 months ago
Alternatives and similar repositories for GPTFast:
Users who are interested in GPTFast are comparing it to the libraries listed below.
- ☆930 · Updated 2 weeks ago
- Training LLMs with QLoRA + FSDP · ☆1,451 · Updated 3 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars · ☆973 · Updated 6 months ago
- ☆446 · Updated 10 months ago
- Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.p… · ☆1,215 · Updated last month
- The official implementation of Self-Play Fine-Tuning (SPIN) · ☆1,118 · Updated 9 months ago
- Fine-tune mistral-7B on 3090s, a100s, h100s · ☆705 · Updated last year
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI · ☆1,361 · Updated 10 months ago
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining · ☆687 · Updated 10 months ago
- Official implementation of Half-Quadratic Quantization (HQQ) · ☆748 · Updated this week
- A simple, performant and scalable Jax LLM! · ☆1,624 · Updated this week
- Train Models Contrastively in Pytorch · ☆643 · Updated this week
- Automatically evaluate your LLMs in Google Colab · ☆592 · Updated 9 months ago
- ☆502 · Updated 5 months ago
- ☆810 · Updated 5 months ago
- A bagel, with everything. · ☆316 · Updated 10 months ago
- [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling · ☆1,612 · Updated 7 months ago
- Serving multiple LoRA finetuned LLM as one · ☆1,029 · Updated 9 months ago
- Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors a… · ☆1,283 · Updated this week
- Minimalistic large language model 3D-parallelism training · ☆1,483 · Updated this week
- The repository for the code of the UltraFastBERT paper · ☆517 · Updated 10 months ago
- Visualize the intermediate output of Mistral 7B · ☆339 · Updated 3 weeks ago
- ☆679 · Updated 2 weeks ago
- Large-scale LLM inference engine · ☆1,295 · Updated this week
- llama3.np is a pure NumPy implementation for Llama 3 model. · ☆973 · Updated 8 months ago
- Inference code for Persimmon-8B · ☆416 · Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free · ☆230 · Updated 3 months ago
- ☆412 · Updated last year
- data cleaning and curation for unstructured text · ☆329 · Updated 6 months ago
- Llama-3 agents that can browse the web by following instructions and talking to you · ☆1,388 · Updated 2 months ago