nicholasyager / llama-cpp-guidance
A guidance compatibility layer for llama-cpp-python
☆36 · Updated last year
Alternatives and similar repositories for llama-cpp-guidance
Users interested in llama-cpp-guidance are comparing it to the libraries listed below.
- Plug n Play GBNF Compiler for llama.cpp ☆27 · Updated last year
- GPT-2 small trained on phi-like data ☆67 · Updated last year
- Let's create synthetic textbooks together :) ☆75 · Updated last year
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆89 · Updated last month
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 · Updated 10 months ago
- The one who calls upon functions - Function-Calling Language Model ☆36 · Updated last year
- ☆66 · Updated last year
- Easily create LLM automation/agent workflows ☆59 · Updated last year
- ☆38 · Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes… ☆146 · Updated last year
- Experimental LLM Inference UX to aid in creative writing ☆120 · Updated 8 months ago
- Client-side toolkit for using large language models, including where self-hosted ☆112 · Updated 9 months ago
- autologic is a Python package that implements the SELF-DISCOVER framework proposed in the paper SELF-DISCOVER: Large Language Models Self… ☆60 · Updated last year
- ☆74 · Updated last year
- run ollama & gguf easily with a single command ☆52 · Updated last year
- A fast batching API to serve LLM models ☆185 · Updated last year
- Chat Markup Language conversation library ☆55 · Updated last year
- Easily view and modify JSON datasets for large language models ☆81 · Updated 3 months ago
- ☆32 · Updated last year
- ☆116 · Updated 8 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks ☆37 · Updated last year
- Complex RAG backend ☆29 · Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆175 · Updated last year
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆90 · Updated last month
- A simple experiment on letting two local LLMs have a conversation about anything! ☆110 · Updated last year
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 · Updated 11 months ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆38 · Updated last year
- large language model for mastering data analysis using pandas ☆47 · Updated last year
- Distributed inference for MLX LLMs ☆93 · Updated last year
- Simple Graph Memory for AI applications ☆89 · Updated 3 months ago