NickCrews / llama-cpp-server-python
Bootstrap a server from llama-cpp in a few lines of Python
☆12 Updated last year
Alternatives and similar repositories for llama-cpp-server-python
Users who are interested in llama-cpp-server-python are comparing it to the libraries listed below.
- A pipeline for LLM knowledge distillation ☆112 Updated 9 months ago
- ☆50 Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT ☆27 Updated last year
- ☆138 Updated 4 months ago
- Lightweight continuous batching OpenAI compatibility using HuggingFace Transformers include T5 and Whisper. ☆29 Updated 9 months ago
- ☆108 Updated 4 months ago
- Low-Rank adapter extraction for fine-tuned transformers models ☆180 Updated last year
- ☆51 Updated last year
- ☆165 Updated 5 months ago
- Easy to use, High Performant Knowledge Distillation for LLMs ☆96 Updated 8 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆47 Updated last year
- Code for the MTEB Arena ☆24 Updated 6 months ago
- entropix style sampling + GUI ☆27 Updated last year
- ☆243 Updated 3 months ago
- Work with your business data using natural language ☆19 Updated last year
- Universal text classifier for generative models ☆24 Updated last year
- ☆68 Updated last year
- A pipeline parallel training script for LLMs. ☆165 Updated 8 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆69 Updated last month
- Scripts for text classification with llama and bert ☆29 Updated 5 months ago
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ☆23 Updated last year
- This reference can be used with any existing OpenAI integrated apps to run with TRT-LLM inference locally on GeForce GPU on Windows inste… ☆127 Updated last year
- Enhancing Translation with RAG-Powered Large Language Models ☆88 Updated last week
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆68 Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 Updated last year
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- Easily view and modify JSON datasets for large language models ☆86 Updated 7 months ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆157 Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆78 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year