awinml / llama-cpp-python-bindings
Run fast LLM Inference using Llama.cpp in Python
☆17 Updated last year
Alternatives and similar repositories for llama-cpp-python-bindings:
Users who are interested in llama-cpp-python-bindings are comparing it to the libraries listed below.
- Function Calling Mistral 7B. Learn how to make function calls with open-source LLMs. ☆48 Updated last year
- Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM. ☆34 Updated last year
- ☆20 Updated last year
- Finetune any model on HF in less than 30 seconds ☆58 Updated 3 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 9 months ago
- This repo focuses on defending against 'adversarial prompts,' detecting and attempting to mitigate objectionable content in real time. ☆13 Updated last year
- Tutorial for DSPy ☆23 Updated 11 months ago
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆30 Updated 8 months ago
- A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) within reach of everyone, particu… ☆34 Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT ☆27 Updated last year
- ☆20 Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆16 Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆19 Updated this week
- Github repo for storing LlamaDatasets ☆33 Updated last year
- Simple Chainlit UI for running LLMs locally using Ollama and LangChain ☆44 Updated last year
- Metadata Enrichment using KeyBERT for advanced and improved RAG. ☆10 Updated last year
- Python Server for C3 AI app. A project that brings the power of Large Language Models (LLM) and Retrieval-Augmented Generation (RAG) with… ☆23 Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 Updated 5 months ago
- A code sample that shows how to use 🦜️🔗langchain, 🦙llama_index and a hosted LLM endpoint to do a standard chat or Q&A about a pdf doc… ☆18 Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆105 Updated 2 weeks ago
- ☆41 Updated last year
- A script for merging an LLM and a LoRA ☆12 Updated last year
- Data extraction with LLM on CPU ☆113 Updated last year
- Unsloth Fine Tuning ☆10 Updated last year
- Experimenting with the text-embeddings-inference server on both CPU and GPU ☆18 Updated last year
- Simple examples using Argilla tools to build AI ☆52 Updated 5 months ago
- Using multiple LLMs for ensemble forecasting ☆16 Updated last year
- Set of scripts to finetune LLMs ☆37 Updated last year
- ☆45 Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours ☆60 Updated 8 months ago