modelize-ai / LLM-Inference-Deployment-Tutorial
A tutorial for LLM developers covering engine design, service deployment, and evaluation/benchmarking. Provides an optimized client/server (C/S) style LLM inference engine.
☆19Updated 2 years ago
Alternatives and similar repositories for LLM-Inference-Deployment-Tutorial
Users who are interested in LLM-Inference-Deployment-Tutorial compare it to the libraries listed below.
- Open Implementations of LLM Analyses☆107Updated last year
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆40Updated 2 years ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆45Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆178Updated 2 years ago
- Manages vllm-nccl dependency☆17Updated last year
- Benchmark baseline for retrieval qa applications☆118Updated last year
- A collection of reproducible inference engine benchmarks☆38Updated 8 months ago
- OpenLLMDE: An open source data engineering framework for LLMs☆18Updated 2 years ago
- This is a text generation method which returns a generator, streaming out each token in real-time during inference, based on Huggingface/…☆96Updated last year
- Data preparation code for CrystalCoder 7B LLM☆45Updated last year
- FuseAI Project☆88Updated 11 months ago
- Leveraging large language models for text-to-SQL synthesis, this project fine-tunes WizardLM/WizardCoder-15B-V1.0 with QLoRA on a custom …☆45Updated 2 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆78Updated last year
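- Chinese financial LLM evaluation benchmark: six categories, twenty-five tasks, graded evaluation; domestic (Chinese) models achieve grade A☆10Updated last year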
- Open sourced backend for Martian's LLM Inference Provider Leaderboard☆19Updated last year
- ☆63Updated last year
- Data preparation code for Amber 7B LLM☆94Updated last year
- Advanced Reasoning Benchmark Dataset for LLMs☆47Updated 2 years ago
- ☆96Updated last year
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Updated last year
- ☆17Updated last year
- Evaluation tools for Retrieval-augmented Generation (RAG) methods.☆167Updated last year
- Understanding the correlation between different LLM benchmarks☆29Updated last year
- Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"☆102Updated last year
- Inference script for Meta's LLaMA models using Hugging Face wrapper☆110Updated 2 years ago
- [EMNLP 2023 Industry Track] A simple prompting approach that enables the LLMs to run inference in batches.☆77Updated last year
- A list of LLM benchmark frameworks.☆73Updated last year
- PROSE Public Benchmark Suite☆28Updated 3 months ago
- A light proxy solution for HuggingFace hub.☆49Updated 2 years ago
- experiments with inference on llama☆103Updated last year