awinml / llama-cpp-python-bindings

Run fast LLM Inference using Llama.cpp in Python

☆17

Alternatives and similar repositories for llama-cpp-python-bindings

Users who are interested in llama-cpp-python-bindings are comparing it to the repositories listed below.

AIAnytime / Zephyr-7B-beta-RAG-Demo
Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM.
☆35 Updated last year
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 5 months ago
edumunozsala / llama-2-7B-4bit-python-coder
Fine-tune and quantize Llama-2-like models to generate Python code using QLoRA, Axolotl,..
☆64 Updated last year
AIAnytime / agent-watch
Agent Watch is an AgentOps monitoring library designed for Crew AI applications.
☆19 Updated 7 months ago
githubpradeep / notebooks
☆54 Updated 5 months ago
deshwalmahesh / PHUDGE
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…
☆49 Updated last year
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆63 Updated 10 months ago
yip-kl / llm_dspy_tutorial
Tutorial for DSPy
☆23 Updated last year
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆108 Updated 3 months ago
geronimi73 / phi2-finetune
☆87 Updated last year
Doriandarko / OraclesGPT
☆11 Updated last year
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66 Updated 8 months ago
iulia-b10 / multilingual-embedding-models
☆20 Updated last year
miralab-ai / autoreason
☆40 Updated 7 months ago
sachinsharma9780 / chatbot_with_ChatGPT_KnowledgeGraph_ArangodB
Building Knowledge Graph-Driven Chatbot with ChatGPT and ArangoDB
☆20 Updated last year
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.
☆33 Updated last year
ianhohoho / auto-hyde
🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…
☆32 Updated last year
Ashufet / Superior-RAG-for-Complex-PDFs-using-LlamaParse
I have explained how to create a superior RAG pipeline for complex PDFs using LlamaParse. We can extract text and tables from a PDF and QA on…
☆46 Updated last year
samchaineau / llm_slerp_generation
Repo hosting code and materials related to speeding up LLM inference using token merging.
☆36 Updated last year
CVxTz / llm-serve-tutorial
☆20 Updated last year
Ashufet / Complex-PDF-RAG-Agent-using-QueryPipeline-from-Scratch_Llamaparse_OS-LLM
☆42 Updated last year
AIAnytime / Function-Calling-Mistral-7B
Function Calling Mistral 7B. Learn how to make function calls with open source LLMs.
☆48 Updated last year
zrizvi93 / trevorhack
☆45 Updated last year
AIAnytime / Synthetic-Data-Generation-using-LLM
Synthetic Data Generation using LLM via Argilla, Distilabel, ChatGPT, etc.
☆30 Updated last year
azharlabs / large-models
☆16 Updated last year
AI-ANK / Airbnb-Listing-Explorer
☆29 Updated last year
mickymultani / RAG-with-Cross-Encoder-Reranker
Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.
☆48 Updated last year
shivamsanju / ragswift
🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform
☆38 Updated last year
amogkam / llama_index_ray
Using LlamaIndex with Ray for productionizing LLM applications
☆71 Updated last year
kyegomez / Finetuning-Suite
Finetune any model on HF in less than 30 seconds
☆57 Updated 3 months ago