kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
☆964 Updated last year
Alternatives and similar repositories for Llama-2-Open-Source-LLM-CPU-Inference
Users who are interested in Llama-2-Open-Source-LLM-CPU-Inference are comparing it to the libraries listed below.
- ☆1,486 Updated last year
- Open-source tool to visualise your RAG 🔮 ☆1,136 Updated 5 months ago
- Run inference on MPT-30B using CPU ☆575 Updated last year
- LLaMA v2 Chatbot ☆1,411 Updated last year
- Evaluation tool for LLM QA chains ☆1,073 Updated 2 years ago
- Ship RAG based LLM web apps in seconds. ☆995 Updated last year
- This repository provides very basic flask, streamlit, and docker examples for the llama_index (fka gpt_index) package ☆624 Updated 10 months ago
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,868 Updated last year
- 🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corpo… ☆1,498 Updated last year
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… ☆1,016 Updated 4 months ago
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend… ☆1,957 Updated last year
- A comprehensive guide to building RAG-based LLM applications for production. ☆1,798 Updated 10 months ago
- Official supported Python bindings for llama.cpp + gpt4all ☆1,019 Updated 2 years ago
- Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning ☆308 Updated 8 months ago
- Simple UI for LLM Model Finetuning ☆2,063 Updated last year
- The web framework for building LLM microservices ☆993 Updated 11 months ago
- ☆769 Updated this week
- RayLLM - LLMs on Ray (Archived). Read README for more info. ☆1,260 Updated 3 months ago
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,458 Updated last year
- A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain ☆3,476 Updated last year
- A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer ☆1,635 Updated last year
- Scale LLM Engine public repository ☆804 Updated this week
- Agent techniques to augment your LLM and push it beyond its limits ☆1,575 Updated last year
- Finetuning Large Language Models on One Consumer GPU in 2 Bits ☆724 Updated last year
- ⛓️ Serving LangChain LLM apps and agents automagically with FastApi. LLMops ☆926 Updated 11 months ago
- Build robust LLM applications with true composability 🔗 ☆419 Updated last year
- Tune any FALCON in 4-bit ☆467 Updated last year
- Chat language model that can use tools and interpret the results ☆1,563 Updated 2 weeks ago
- LOMO: LOw-Memory Optimization ☆985 Updated 11 months ago
- ⚡ Langchain apps in production using Jina & FastAPI ☆1,631 Updated last year