kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
☆962 Updated last year
Alternatives and similar repositories for Llama-2-Open-Source-LLM-CPU-Inference
Users who are interested in Llama-2-Open-Source-LLM-CPU-Inference are comparing it to the libraries listed below.
- ⚡ Langchain apps in production using Jina & FastAPI ☆1,631 Updated last year
- LLaMA v2 Chatbot ☆1,408 Updated last year
- 🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corpo… ☆1,494 Updated last year
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend… ☆1,959 Updated last year
- Evaluation tool for LLM QA chains ☆1,073 Updated 2 years ago
- ☆1,475 Updated last year
- Locally hosted tool that connects documents to LLMs for summarization and querying, with a simple GUI. ☆791 Updated last year
- Agent techniques to augment your LLM and push it beyond its limits ☆1,576 Updated 11 months ago
- ⛓️ Serving LangChain LLM apps and agents automagically with FastApi. LLMops ☆927 Updated 10 months ago
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,477 Updated 2 weeks ago
- A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain ☆3,478 Updated last year
- LLM(😽) ☆1,667 Updated 3 months ago
- RayLLM - LLMs on Ray (Archived). Read README for more info. ☆1,261 Updated 2 months ago
- Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens" ☆704 Updated last year
- Ship RAG based LLM web apps in seconds. ☆991 Updated last year
- The web framework for building LLM microservices ☆990 Updated 10 months ago
- Open-source tool to visualise your RAG 🔮 ☆1,128 Updated 4 months ago
- The Official Python Client for Lamini's API ☆2,534 Updated last month
- Run inference on MPT-30B using CPU ☆575 Updated last year
- 💬 RasaGPT is the first headless LLM chatbot platform built on top of Rasa and Langchain. Built w/ Rasa, FastAPI, Langchain, LlamaIndex, … ☆2,422 Updated last year
- H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/ ☆4,301 Updated last month
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,864 Updated last year
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions ☆820 Updated 2 years ago
- [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings ☆1,949 Updated 4 months ago
- LLM as a Chatbot Service ☆3,320 Updated last year
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,457 Updated last year
- This repository provides very basic flask, streamlit, and docker examples for the llama_index (fka gpt_index) package ☆622 Updated 8 months ago
- ☆1,026 Updated last year
- kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023) ☆576 Updated last month
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,873 Updated last year