kemingy / vllm-env
Set up the env for vLLM users
☆16 Updated last year
Related projects
Alternatives and complementary repositories for vllm-env
- Sentence Embedding as a Service ☆14 Updated last year
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆27 Updated last year
- This repository contains statistics about the AI Infrastructure products. ☆18 Updated 4 months ago
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp ☆14 Updated 10 months ago
- A collection of models built with ColossalAI ☆32 Updated 2 years ago
- Some microbenchmarks and design docs before commencement ☆12 Updated 3 years ago
- OpenAI compatible API for open source LLMs ☆15 Updated last year
- ☆35 Updated this week
- Deploy ChatGLM on Modelz ☆15 Updated last year
- OVALChat is a customizable Web app aimed at conducting user studies with chatbots ☆27 Updated 10 months ago
- A memory efficient DLRM training solution using ColossalAI ☆100 Updated 2 years ago
- Simple dependency injection framework for Python ☆20 Updated 6 months ago
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 Updated 10 months ago
- ☆26 Updated last year
- ☆14 Updated last year
- Efficient and Scalable Estimation of Tool Representations in Vector Space ☆16 Updated 2 months ago
- ☆22 Updated 3 months ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆16 Updated 7 months ago
- ☆33 Updated this week
- Evaluation for AI apps and agent ☆35 Updated 10 months ago
- A lightweight script for processing HTML page to markdown format with support for code blocks ☆73 Updated 7 months ago
- Cortex-compatible model server for Python and TensorFlow ☆16 Updated last year
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024] ☆46 Updated this week
- Benchmark suite for LLMs from Fireworks.ai ☆58 Updated 2 weeks ago
- Large-scale exact string matching tool ☆15 Updated last week
- ☆16 Updated 5 months ago
- Creating Generative AI Apps which work ☆16 Updated 4 months ago
- Accelerating your LLM training to full speed ☆37 Updated this week
- Experiments w/ ChatGPT, LangChain, local LLMs ☆24 Updated last year