awinml / llama-cpp-python-bindings
Run fast LLM Inference using Llama.cpp in Python
☆17Updated last year
Alternatives and similar repositories for llama-cpp-python-bindings
Users who are interested in llama-cpp-python-bindings are comparing it to the libraries listed below.
- Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM.☆35Updated last year
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆91Updated 5 months ago
- Fine-tune and quantize Llama-2-like models to generate Python code using QLoRA, Axolotl,…☆64Updated last year
- Agent Watch is an AgentOps monitoring library designed for Crew AI applications.☆19Updated 7 months ago
- ☆54Updated 5 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours☆63Updated 10 months ago
- Tutorial for DSPy☆23Updated last year
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆108Updated 3 months ago
- ☆87Updated last year
- ☆11Updated last year
- High-level library for batched embeddings generation, blazingly fast web-based RAG and quantized index processing ⚡☆66Updated 8 months ago
- ☆20Updated last year
- ☆40Updated 7 months ago
- Building Knowledge Graph-Driven Chatbot with ChatGPT and ArangoDB☆20Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated last year
- 🔎 A deep-dive into HyDE for Advanced LLM RAG + 💡 Introducing AutoHyDE, a semi-supervised framework to improve the effectiveness, covera…☆32Updated last year
- I have explained how to create a superior RAG pipeline for complex PDFs using LlamaParse. We can extract text and tables from a PDF and QA on…☆46Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated last year
- ☆20Updated last year
- ☆42Updated last year
- Function Calling Mistral 7B. Learn how to make function calls with open-source LLMs.☆48Updated last year
- ☆45Updated last year
- Synthetic Data Generation using LLMs via Argilla, Distilabel, ChatGPT, etc.☆30Updated last year
- ☆16Updated last year
- ☆29Updated last year
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.☆48Updated last year
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform☆38Updated last year
- Using LlamaIndex with Ray for productionizing LLM applications☆71Updated last year
- Finetune any model on HF in less than 30 seconds☆57Updated 3 months ago