tangledgroup / llama-cpp-python-exploit
llama-cpp-python-exploit
☆15 · Updated last year
Alternatives and similar repositories for llama-cpp-python-exploit
Users who are interested in llama-cpp-python-exploit are comparing it to the libraries listed below.
- llm-agent-smith ☆12 · Updated last year
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,868 · Updated last year
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,228 · Updated last week
- ☆2,981 · Updated 10 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,887 · Updated last year
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆714 · Updated last year
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,560 · Updated last month
- Large-scale LLM inference engine ☆1,471 · Updated last week
- Run Mixtral-8x7B models in Colab or consumer desktops ☆2,312 · Updated last year
- [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which ach… ☆5,262 · Updated 4 months ago
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,005 · Updated 10 months ago
- A language for constraint-guided and efficient LLM programming. ☆3,992 · Updated last month
- High-speed Large Language Model Serving for Local Deployment ☆8,231 · Updated 4 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,631 · Updated last year
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,011 · Updated 3 months ago
- Chat language model that can use tools and interpret the results ☆1,569 · Updated last month
- Robust recipes to align language models with human and AI preferences ☆5,260 · Updated this week
- PyTorch native post-training library ☆5,323 · Updated this week
- Tools for merging pretrained large language models. ☆6,016 · Updated 3 weeks ago
- Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript ☆588 · Updated last year
- ☆986 · Updated 5 months ago
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,062 · Updated last year
- [ICLR 2024] Efficient Streaming Language Models with Attention Sinks ☆6,927 · Updated last year
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,167 · Updated 9 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,473 · Updated this week
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆892 · Updated last year
- Convolutions for Sequence Modeling ☆891 · Updated last year
- A benchmark to evaluate language models on questions I've previously asked them to solve. ☆1,022 · Updated 2 months ago
- An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm. ☆4,891 · Updated 3 months ago
- Collection of notebook guides created by the Brev.dev team! ☆1,774 · Updated 3 weeks ago