bigscience-workshop / petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
⭐ 9,432 · Updated 5 months ago
Alternatives and similar repositories for petals:
Users interested in petals are comparing it to the libraries listed below.
- QLoRA: Efficient Finetuning of Quantized LLMs (⭐ 10,242 · Updated 8 months ago)
- OpenLLaMA, a permissively licensed open-source reproduction of Meta AI's LLaMA 7B trained on the RedPajama dataset (⭐ 7,440 · Updated last year)
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Ad… (⭐ 6,033 · Updated 5 months ago)
- LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath (⭐ 9,335 · Updated 6 months ago)
- StableLM: Stability AI Language Models (⭐ 15,830 · Updated 10 months ago)
- Universal LLM Deployment Engine with ML Compilation (⭐ 19,972 · Updated this week)
- Python bindings for llama.cpp (⭐ 8,621 · Updated 2 weeks ago)
- Tensor library for machine learning (⭐ 11,857 · Updated this week)
- Instruct-tune LLaMA on consumer hardware (⭐ 18,803 · Updated 6 months ago)
- Running large language models on a single GPU for throughput-oriented scenarios (⭐ 9,261 · Updated 3 months ago)
- Large Language Model Text Generation Inference (⭐ 9,756 · Updated this week)
- Locally run an Instruction-Tuned Chat-Style LLM (⭐ 10,236 · Updated last year)
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. (⭐ 37,775 · Updated this week)
- Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI-compatible API endpoints in the cloud. (⭐ 10,587 · Updated this week)
- A language for constraint-guided and efficient LLM programming. (⭐ 3,808 · Updated 8 months ago)
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks (⭐ 6,791 · Updated 7 months ago)
- OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamical… (⭐ 37,231 · Updated 6 months ago)
- LlamaIndex is the leading framework for building LLM-powered agents over your data. (⭐ 38,907 · Updated this week)
- The simplest way to run LLaMA on your local machine (⭐ 13,089 · Updated 7 months ago)
- SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 14+ clouds). Get unified execution, cost savings, and high GPU availability v… (⭐ 7,173 · Updated this week)
- CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex. (⭐ 4,998 · Updated 2 weeks ago)
- Code and documentation to train Stanford's Alpaca models, and generate the data. (⭐ 29,815 · Updated 7 months ago)
- The RedPajama-Data repository contains code for preparing large datasets for training large language models. (⭐ 4,643 · Updated 2 months ago)
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. (⭐ 2,822 · Updated last year)
- The AI-native open-source embedding database (⭐ 17,686 · Updated this week)
- A Gradio web UI for Large Language Models with support for multiple inference backends. (⭐ 42,453 · Updated this week)
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. (⭐ 8,201 · Updated 9 months ago)
- A Bulletproof Way to Generate Structured JSON from Language Models (⭐ 4,569 · Updated 11 months ago)
- Semantic cache for LLMs. Fully integrated with LangChain and llama_index. (⭐ 7,380 · Updated 4 months ago)
- Simple UI for LLM Model Finetuning (⭐ 2,052 · Updated last year)