kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
☆958 · Updated last year
Alternatives and similar repositories for Llama-2-Open-Source-LLM-CPU-Inference:
Users who are interested in Llama-2-Open-Source-LLM-CPU-Inference are comparing it to the libraries listed below.
- Run inference on MPT-30B using CPU ☆575 · Updated last year
- Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend… ☆1,962 · Updated 10 months ago
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,830 · Updated last year
- Evaluation tool for LLM QA chains ☆1,066 · Updated last year
- Open-source tool to visualise your RAG 🔮 ☆1,101 · Updated 3 weeks ago
- ggml implementation of BERT ☆476 · Updated 11 months ago
- LLaMA v2 Chatbot ☆1,400 · Updated last year
- A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain ☆3,464 · Updated 10 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,806 · Updated last year
- ⚡ Langchain apps in production using Jina & FastAPI ☆1,617 · Updated last year
- CodeTF: One-stop Transformer Library for State-of-the-art Code LLM ☆1,464 · Updated this week
- ☆756 · Updated last year
- 🤖 Everything you need to create an LLM Agent: tools, prompts, frameworks, and models, all in one place. ☆1,761 · Updated 2 months ago
- ☆1,437 · Updated last year
- Fine-Tuning Embedding for RAG with Synthetic Data ☆480 · Updated last year
- Chain together LLMs for reasoning & orchestrate multiple large models for accomplishing complex tasks ☆599 · Updated last year
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆3,893 · Updated this week
- Agent techniques to augment your LLM and push it beyond its limits ☆1,564 · Updated 8 months ago
- LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transform… ☆1,448 · Updated last year
- Chat with your documents offline using AI. ☆715 · Updated last year
- A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer ☆1,624 · Updated last year
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆706 · Updated last year
- Make Llama2 use Code Execution, Debug, Save Code, Reuse it, Access to Internet ☆687 · Updated last year
- AutoChain: Build lightweight, extensible, and testable LLM Agents ☆1,825 · Updated 8 months ago
- An easy-to-use LLMs quantization package with user-friendly APIs, based on GPTQ algorithm. ☆4,650 · Updated last week
- A comprehensive guide to building RAG-based LLM applications for production. ☆1,749 · Updated 5 months ago
- A school for camelids ☆1,211 · Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆36 · Updated last year
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… ☆990 · Updated 3 months ago
- prompt2model - Generate Deployable Models from Natural Language Instructions ☆1,982 · Updated last month