venuatu/llama
Inference code for LLaMA models
☆45 · Updated last year
Related projects
Alternatives and complementary repositories for llama
- Inference code for Facebook LLaMA models with Wrapyfi support ☆130 · Updated last year
- Just a simple HowTo for https://github.com/johnsmith0031/alpaca_lora_4bit ☆31 · Updated last year
- Inference code for LLaMA models ☆35 · Updated last year
- Inference code for LLaMA models with a Gradio interface and rolling generation like ChatGPT ☆48 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆124 · Updated last year
- Train LLaMA with LoRA on a single RTX 4090 and merge the LoRA weights to work like Stanford Alpaca. ☆50 · Updated last year
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆111 · Updated last year
- 4-bit quantization of SantaCoder using GPTQ ☆53 · Updated last year
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER)… ☆122 · Updated last year
- Framework-agnostic Python runtime for RWKV models ☆145 · Updated last year
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆66 · Updated last year
- Conversational language model toolkit for training against human preferences. ☆41 · Updated 7 months ago
- Instruct-tuning LLaMA on consumer hardware ☆66 · Updated last year
- ☆26 · Updated last year
- Extends the original llama.cpp repo to support the RedPajama model. ☆117 · Updated 2 months ago
- ChatGPT-like Web UI for RWKVstic ☆100 · Updated last year
- Tune MPTs ☆84 · Updated last year
- Inference code for LLaMA 2 models ☆31 · Updated 4 months ago
- This project aims to make RWKV accessible to everyone using a Hugging Face-like interface, while keeping it close to the R and D RWKV bra… ☆64 · Updated last year
- SoTA transformers with a C backend for fast inference on your CPU. ☆311 · Updated 11 months ago
- RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond! ☆133 · Updated 3 months ago
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with a LLaMA implementation. ☆70 · Updated last year
- A finetuning pipeline for instruct-tuning Raven 14B using 4-bit QLoRA and the Ditty finetuning library ☆28 · Updated 5 months ago
- Train LLaMA LoRAs easily ☆29 · Updated last year
- Embeddings-focused small version of the LLaMA NLP model ☆102 · Updated last year
- ☆40 · Updated last year
- rwkv_chatbot ☆62 · Updated last year
- LLM family chart ☆51 · Updated last year