A flexible serving framework that delivers efficient and fault-tolerant LLM inference for clustered deployments.
☆92Jun 3, 2026Updated this week
Alternatives and similar repositories for xllm-service
Users who are interested in xllm-service are comparing it to the libraries listed below.
- An MLLM inference engine for academic research☆22Jan 30, 2026Updated 4 months ago
- Large Language Model (LLM) Serving Paper and Resource List☆28Apr 16, 2026Updated last month
- ☆98Mar 26, 2025Updated last year
- A high-performance inference system for large language models, designed for production environments.☆500Dec 19, 2025Updated 5 months ago
- GEMM by WMMA (tensor core)☆15Jul 31, 2022Updated 3 years ago
- Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts"☆11Nov 18, 2022Updated 3 years ago
- Undergraduate thesis project: a financial question-answering system built on a FinGLM-based multimodal large model☆32Jun 26, 2024Updated last year
- The official implementation of InfoRM [NeurIPS 2024].☆15Oct 25, 2025Updated 7 months ago
- OrqueIO main source code repository☆37May 26, 2026Updated 2 weeks ago
- MobileNet SSD forward-pass (inference) detection based on Caffe☆10Nov 30, 2018Updated 7 years ago
- Deep Large-Scale Inference Using Knockoffs☆11Nov 4, 2021Updated 4 years ago
- Code and data release of the paper Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows☆15Oct 4, 2024Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆17Dec 19, 2024Updated last year
- HELP: a dataset for Handling Entailments with Lexical and logical Phenomena (Ver.1.0)☆15Jul 20, 2023Updated 2 years ago
- ☆12Dec 21, 2022Updated 3 years ago
- Implementation of Hyena Hierarchy in JAX☆10Apr 30, 2023Updated 3 years ago
- ☆17Apr 10, 2025Updated last year
- Depict GPU memory footprint during DNN training of PyTorch☆11Nov 17, 2022Updated 3 years ago
- ☆10Jun 29, 2020Updated 5 years ago
- Benchmark tests supporting the TiledCUDA library.☆19Nov 19, 2024Updated last year
- ☆12Sep 1, 2023Updated 2 years ago
- ☆12Mar 13, 2023Updated 3 years ago
- A deep learning training framework implemented from scratch in C++ and Python☆12Nov 22, 2020Updated 5 years ago
- a simple API to use CUPTI☆10Aug 19, 2025Updated 9 months ago
- A toolkit for developers to simplify the transformation of nn.Module instances. It now corresponds to Pytorch.fx.☆13Apr 7, 2023Updated 3 years ago
- ☆11Apr 5, 2021Updated 5 years ago
- FP8 flash attention implemented on the Ada architecture using the cutlass repository☆81Aug 12, 2024Updated last year
- Short RL☆18Apr 16, 2026Updated last month
- Collection of LLM completions for reasoning-gym task datasets☆31Jul 4, 2025Updated 11 months ago
- ☕️ A vscode extension for netron, support *.pdmodel, *.nb, *.onnx, *.pb, *.h5, *.tflite, *.pth, *.pt, *.mnn, *.param, etc.☆14Jun 4, 2023Updated 3 years ago
- ssd/mobile ssd/yolo v2/yolo v3 implemented in opencv3☆10Dec 12, 2018Updated 7 years ago
- This repo holds the research projects of our lab.☆11Jan 20, 2024Updated 2 years ago
- [CVPR2026] ColaVLA: Leveraging Cognitive Latent Reasoning for Hierarchical Parallel Trajectory Planning in Autonomous Driving & [CVPR2025…☆56Mar 26, 2026Updated 2 months ago
- dataloader for mocap dataset☆36Oct 21, 2025Updated 7 months ago
- Composable Data and Type Generators for C++☆10Mar 25, 2019Updated 7 years ago
- ☆26Dec 30, 2025Updated 5 months ago
- ☆18Nov 30, 2025Updated 6 months ago
- Trains 4~6x faster than the original caffe-ssd☆10Jun 22, 2021Updated 4 years ago
- Nex General Agentic Data Pipeline, an end-to-end pipeline for generating high-quality agentic training data.☆33Nov 19, 2025Updated 6 months ago