opendatahub-io / vllm-tgis-adapter
vLLM adapter for a TGIS-compatible gRPC server.
☆10 Updated this week
Related projects
Alternatives and complementary repositories for vllm-tgis-adapter
- ☆36 Updated 2 years ago
- ☆43 Updated 2 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 Updated this week
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 Updated 5 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆51 Updated this week
- ☆41 Updated 2 weeks ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling" ☆15 Updated last week
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆38 Updated 10 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 Updated 8 months ago
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks. ☆28 Updated 4 months ago
- ☆45 Updated 2 months ago
- ☆15 Updated last year
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆37 Updated 5 months ago
- MEXMA: Token-level objectives improve sentence representations ☆34 Updated 2 weeks ago
- ☆36 Updated 3 weeks ago
- A place to store reusable transformer components of my own creation or found on the interwebs ☆44 Updated 2 weeks ago
- Data preparation code for CrystalCoder 7B LLM ☆42 Updated 6 months ago
- Make triton easier ☆41 Updated 5 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆37 Updated 7 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆11 Updated 5 months ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆14 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 Updated 8 months ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆30 Updated 3 months ago
- QuIP quantization ☆46 Updated 8 months ago
- ☆32 Updated last year
- QLoRA with Enhanced Multi GPU Support ☆36 Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 Updated last year