thekevinscott / vicuna-7b
Vicuna 7B is a large language model that runs in the browser. It exposes programmatic access with minimal configuration.
☆20 · Updated 2 years ago
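The description above mentions programmatic access from the browser; a minimal usage sketch might look like the following. The package name `vicuna-7b`, the `Vicuna7B` class, and the `load`/`generate`/`onToken` names are illustrative assumptions, not taken from the repository's documentation.

```typescript
// Hypothetical sketch: the package name 'vicuna-7b', the Vicuna7B class,
// and the load/generate/onToken names are assumptions for illustration,
// not the repo's documented API.
import { Vicuna7B } from 'vicuna-7b';

async function main(): Promise<void> {
  // Fetch and initialize the model weights in the browser (browser LLM
  // runtimes typically back this with WebGPU or WASM).
  const model = new Vicuna7B();
  await model.load();

  // Stream tokens as they are generated, then use the full completion.
  const reply = await model.generate('What is a 7B-parameter model?', {
    maxTokens: 128,
    onToken: (token: string) => console.log(token),
  });
  console.log('Full reply:', reply);
}

main().catch(console.error);
```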
Alternatives and similar repositories for vicuna-7b
Users interested in vicuna-7b are comparing it to the libraries listed below.
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆160 · Updated 2 years ago
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ☆110 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆129 · Updated 2 years ago
- PB-LLM: Partially Binarized Large Language Models ☆156 · Updated 2 years ago
- AI Assistant running within your browser. ☆76 · Updated 11 months ago
- HuggingChat-like UI in Gradio ☆70 · Updated 2 years ago
- [ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation ☆180 · Updated last year
- GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ ☆102 · Updated 2 years ago
- LLM Chat is an open-source serverless alternative to ChatGPT. ☆35 · Updated last year
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- ☆62 · Updated 10 months ago
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast ☆150 · Updated last year
- ☆52 · Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆37 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆53 · Updated last year
- Train your own small bitnet model ☆74 · Updated last year
- Data preparation code for Amber 7B LLM ☆93 · Updated last year
- Zero-trust AI APIs for easy and private consumption of open-source LLMs ☆40 · Updated last year
- ☆67 · Updated last year
- ☆36 · Updated last year
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and export to onnx/onnx-runtime easily. ☆182 · Updated 7 months ago
- ☆53 · Updated last year
- Merge Transformers language models by use of gradient parameters. ☆208 · Updated last year
- Ultra Fast Multi-Modality Vector Database ☆17 · Updated last year
- GGUF Quantization of any LLM. ☆41 · Updated last year
- LLM finetuning ☆41 · Updated 2 years ago
- A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods. ☆196 · Updated 10 months ago
- A simplified version of Google's Gemma model to be used for learning ☆26 · Updated last year
- Host the GPTQ model using AutoGPTQ as an API that is compatible with text generation UI API. ☆90 · Updated 2 years ago
- GPT-2 small trained on phi-like data ☆67 · Updated last year