kemingy / vllm-env
Set up the environment for vLLM users
☆16 · Updated last year
Alternatives and similar repositories for vllm-env:
Users that are interested in vllm-env are comparing it to the libraries listed below
- Sentence Embedding as a Service ☆14 · Updated last year
- A collection of models built with ColossalAI ☆32 · Updated 2 years ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆36 · Updated last year
- Reasoning by Communicating with Agents ☆24 · Updated 4 months ago
- A Python implementation of Toolformer using Huggingface Transformers ☆15 · Updated last year
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 · Updated last year
- Evaluation of bm42 sparse indexing algorithm ☆64 · Updated 7 months ago
- Large-scale exact string matching tool ☆15 · Updated 3 months ago
- Benchmark suite for LLMs from Fireworks.ai ☆66 · Updated last week
- Deploy ChatGLM on Modelz ☆15 · Updated last year
- A memory efficient DLRM training solution using ColossalAI ☆101 · Updated 2 years ago
- kimi-chat test data ☆7 · Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆76 · Updated last year
- Evaluation for AI apps and agent ☆36 · Updated last year
- An open-source NLP library: fast text cleaning and preprocessing ☆23 · Updated 3 years ago
- ☆84 · Updated last year
- LLMs as Collaboratively Edited Knowledge Bases ☆44 · Updated 11 months ago
- The "GPT-API-Accelerate" project provides a set of Python classes for accelerating the process of generating responses to prompts using t… ☆22 · Updated 4 months ago
- The multilingual variant of GLM, a general language model trained with autoregressive blank infilling objective ☆62 · Updated 2 years ago
- ☆16 · Updated 8 months ago
- ☆53 · Updated 8 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks ☆78 · Updated 10 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated 9 months ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Updated last year
- ☆17 · Updated last year
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆15 · Updated 10 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆62 · Updated 10 months ago
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆40 · Updated last year
- Efficient and Scalable Estimation of Tool Representations in Vector Space ☆18 · Updated 5 months ago