Yard1 / Ray-DeepSpeed-Inference
☆17 · Updated 2 years ago
Alternatives and similar repositories for Ray-DeepSpeed-Inference
Users who are interested in Ray-DeepSpeed-Inference are comparing it to the libraries listed below.
- Benchmark suite for LLMs from Fireworks.ai ☆75 · Updated 2 weeks ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆67 · Updated last year
- The "GPT-API-Accelerate" project provides a set of Python classes for accelerating the process of generating responses to prompts using t… ☆23 · Updated 7 months ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆242 · Updated last year
- ☆52 · Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆130 · Updated 11 months ago
- Experiments on speculative sampling with Llama models ☆126 · Updated last year
- ☆53 · Updated last year
- FasterTransformer for the CodeGeeX model ☆63 · Updated last year
- Spherical merge of PyTorch/HF-format language models with minimal feature loss. ☆123 · Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆78 · Updated last year
- Evaluation for AI apps and agents ☆41 · Updated last year
- Light local website for displaying the performance of different chat models. ☆87 · Updated last year
- Implementation of the paper 'LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens' ☆136 · Updated 10 months ago
- ☆40 · Updated last year
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆132 · Updated 11 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated last year
- This is a text generation method which returns a generator, streaming out each token in real-time during inference, based on Huggingface/… ☆95 · Updated last year
- ☆76 · Updated last year
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆53 · Updated 6 months ago
- Leveraging large language models for text-to-SQL synthesis, this project fine-tunes WizardLM/WizardCoder-15B-V1.0 with QLoRA on a custom … ☆44 · Updated last year
- Retrieves parquet files from Hugging Face, identifies and quantifies junky data, duplication, contamination, and biased content in datase… ☆53 · Updated last year
- A pipeline for LLM knowledge distillation ☆104 · Updated 2 months ago
- Data preparation code for Amber 7B LLM ☆90 · Updated last year
- Unofficial implementation of AlpaGasus ☆91 · Updated last year
- ☆118 · Updated last year
- ☆193 · Updated 3 weeks ago
- Pre-training code for CrystalCoder 7B LLM ☆54 · Updated last year
- Reformatted Alignment ☆114 · Updated 8 months ago
- ☆105 · Updated last year