tairov / llama2.py
Inference Llama 2 in one file of pure Python
☆426 Updated 2 months ago
Alternatives and similar repositories for llama2.py
Users that are interested in llama2.py are comparing it to the libraries listed below
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆725 Updated 2 years ago
- A bagel, with everything. ☆326 Updated last year
- ☆415 Updated 2 years ago
- Inference code for Mistral and Mixtral hacked up into original Llama implementation ☆371 Updated 2 years ago
- ☆593 Updated last year
- Python bindings for ggml ☆147 Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 Updated last year
- batched loras ☆349 Updated 2 years ago
- ☆867 Updated 2 years ago
- Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript ☆616 Updated last year
- Tune any FALCON in 4-bit ☆463 Updated 2 years ago
- The repository for the code of the UltraFastBERT paper ☆518 Updated last year
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆685 Updated last year
- ☆445 Updated last year
- port of Andrej Karpathy's llm.c to Mojo ☆363 Updated 6 months ago
- C++ implementation for 💫StarCoder ☆459 Updated 2 years ago
- a small code base for training large models ☆322 Updated 9 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆180 Updated last year
- ☆166 Updated 6 months ago
- Inference Llama 2 in one file of pure 🔥 ☆2,116 Updated 2 months ago
- ☆472 Updated 2 years ago
- Official implementation of Half-Quadratic Quantization (HQQ) ☆912 Updated last month
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆146 Updated 2 years ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆232 Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆161 Updated 2 years ago
- Generate textbook-quality synthetic LLM pretraining data ☆509 Updated 2 years ago
- Inference code for Persimmon-8B ☆412 Updated 2 years ago
- Python bindings for llama.cpp ☆198 Updated 2 years ago
- Command-line script for inferencing from models such as MPT-7B-Chat ☆100 Updated 2 years ago
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,879 Updated 2 years ago