LLM inference service performance testing
☆44Dec 17, 2023Updated 2 years ago
Alternatives and similar repositories for llm-inference-benchmark
Users that are interested in llm-inference-benchmark are comparing it to the libraries listed below.
- Survey of small language models☆18Jul 23, 2024Updated last year
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec…☆38Jan 6, 2026Updated 4 months ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.☆11May 6, 2023Updated 3 years ago
- E-commerce shopping-guide chatbot: natural language understanding (NLU), text error correction, and word sense disambiguation☆12May 5, 2020Updated 6 years ago
- LLM Inference benchmark☆437Jul 23, 2024Updated last year
- Uniformaly: Towards Task-Agnostic Unified Anomaly Detection☆15Sep 15, 2023Updated 2 years ago
- This is the official implementation of our paper "LAR: Look Around and Refer".☆30Dec 1, 2022Updated 3 years ago
- A human-friendly implementation of the iRobot Open Interface version 2 API.☆14May 14, 2016Updated 9 years ago
- [EMNLP2023]: MIRACLE: Towards Personalized Dialogue Generation with Latent-Space Multiple Personal Attribute Control☆12Nov 11, 2023Updated 2 years ago
- Implementation of AdaCQR (COLING 2025)☆15Dec 30, 2024Updated last year
- ☆12Mar 19, 2022Updated 4 years ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths☆18Jul 10, 2025Updated 9 months ago
- Python library written in Rust for creating/transporting/parsing AI characters between different frontends (TavernAI, SillyTavern, TextGe…☆21Nov 14, 2025Updated 5 months ago
- CVPR25☆27Jul 2, 2025Updated 10 months ago
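- Text security audit: semantic-model filtering and a sensitive-content detection system☆39Feb 14, 2025Updated last year
- Integrated management platform for scenic areas: a perfect combination of ECharts and a large-screen dashboard, with responsive large-screen width (percentage) and height (rem)☆11Apr 27, 2018Updated 8 years ago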
- Measuring and Controlling Persona Drift in Language Model Dialogs☆24Feb 26, 2024Updated 2 years ago
- LLM Agents: Landing Page Generation for an E-commerce Platform Using CrewAI, Groq-LangChain and Qdrant☆15May 30, 2024Updated last year
- High-Performance Linpack Benchmark adopted version for GPU backend☆12Sep 12, 2022Updated 3 years ago
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N…☆12Jun 24, 2024Updated last year
- A heavy modification of the original c_uart_interface_example, works on ARM Cortex-M4 STM32F4 (as an offboard processor)☆11Jul 8, 2016Updated 9 years ago
- Notes on putting micropython on STM32F407VG bare board☆11Oct 7, 2019Updated 6 years ago
- Visual self-questioning for large vision-language assistant.☆44Jul 23, 2025Updated 9 months ago
- ☆19Oct 6, 2025Updated 7 months ago
- A simple, easy-to-hack GraphRAG implementation☆15Sep 21, 2024Updated last year
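- Large language model evaluation platform supporting multiple evaluation benchmarks, custom datasets, and performance testing. Supports RAG evaluation on custom datasets.☆87Aug 20, 2025Updated 8 months ago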
- Implementation of various algorithms in the Nested Sequential Monte Carlo family of methods.☆14Sep 9, 2015Updated 10 years ago
- ☆10Jul 18, 2024Updated last year
- ☆15Apr 13, 2024Updated 2 years ago
- Chinese localization of H1ve-theme and CTFd-owl☆18Nov 10, 2022Updated 3 years ago
- Pre-built ROCm-GDB and GPU Debug SDK binaries☆16Mar 21, 2019Updated 7 years ago
- socat - Multipurpose relay (cloned from git://repo.or.cz/socat.git) http://www.dest-unreach.org/socat/☆21Jan 24, 2016Updated 10 years ago
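- Batch data processing with large models; currently supports OCR with various large models, including Tongyi Qianwen (Qwen), Moonshot AI, Baidu PaddlePaddle OCR, OpenAI, and LLAVA. Use LLMs to generate or clean data for academic use. Supports OCR with qwen, m…☆16Sep 15, 2024Updated last year
- Various models and algorithms I implemented while learning TensorFlow 2.0, shared for reference☆20Jul 30, 2020Updated 5 years ago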
- a game framework. warning: wip, dev, unstable, radiation hazard, defcon 3☆24May 10, 2015Updated 10 years ago
- ☆16Nov 19, 2025Updated 5 months ago
- ☆12Jan 25, 2023Updated 3 years ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆47Dec 1, 2024Updated last year
- Study notes for System Planning and Management Engineer (系统规划与管理师)☆12Apr 7, 2021Updated 5 years ago