hpcaitech / SwiftInferLinks

Efficient AI Inference & Serving

☆478

Alternatives and similar repositories for SwiftInfer

Users that are interested in SwiftInfer are comparing it to the libraries listed below

Sorting:

inferflow / inferflow
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
☆248Updated last year
modelscope / dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …
☆266Updated 2 months ago
vectorch-ai / ScaleLLM
A high-performance inference system for large language models, designed for production environments.
☆479Updated 2 weeks ago
alipay / PainlessInferenceAcceleration
Accelerate inference without tears
☆361Updated last week
FlagAI-Open / Aquila2
The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models.
☆445Updated last year
ninehills / llm-inference-benchmark
LLM Inference benchmark
☆428Updated last year
IEIT-Yuan / Yuan2.0-M32
Mixture-of-Experts (MoE) Language Model
☆189Updated last year
open-compass / MixtralKit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
☆771Updated last year
IEIT-Yuan / Yuan-2.0
Yuan 2.0 Large Language Model
☆688Updated last year
QwenLM / qwen.cpp
C++ implementation of Qwen-LM
☆606Updated 10 months ago
kwai / KwaiYii
☆229Updated 2 years ago
WangRongsheng / Aurora
The official codes for "Aurora: Activating chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning"
☆264Updated last year
HIT-SCIR / Chinese-Mixtral-8x7B
中文Mixtral-8x7B（Chinese-Mixtral-8x7B）
☆654Updated last year
mindspore-lab / mindformers
☆175Updated this week
OpenBMB / BMTrain
Efficient Training (including pre-training and fine-tuning) for Big Models
☆611Updated last month
DachengLi1 / LongChat
Official repository for LongChat and LongEval
☆531Updated last year
InternLM / InternEvo
InternEvo is an open-sourced lightweight training framework aims to support model pre-training without the need for extensive dependencie…
☆410Updated 2 months ago
QwenLM / vllm-gptq
A high-throughput and memory-efficient inference and serving engine for LLMs
☆138Updated 10 months ago
Tencent / KsanaLLM
☆507Updated last month
Chinese-Tiny-LLM / Chinese-Tiny-LLM
☆234Updated last year
alibaba / Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
☆660Updated last year
hpcaitech / EnergonAI
Large-scale model inference.
☆631Updated 2 years ago
intel / xFasterTransformer
☆430Updated last month
GPT-Fathom / GPT-Fathom
GPT-Fathom is an open-source and reproducible LLM evaluation suite, benchmarking 10+ leading open-source and closed-source LLMs as well a…
☆347Updated last year
OpenMOSS / CoLLiE
Collaborative Training of Large Language Models in an Efficient Way
☆414Updated last year
bytedance / ByteTransformer
optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052
☆479Updated last year
xverse-ai / XVERSE-13B
XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
☆645Updated last year
THUDM / FasterTransformer
Transformer related optimization, including BERT, GPT
☆39Updated 2 years ago
THUDM / MathGLM
Official Pytorch Implementation for MathGLM
☆327Updated last year
alibaba / ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
☆431Updated last week