AI Infra LLM infer/ tensorrt-llm/ vllm
☆23Mar 7, 2026Updated 2 weeks ago
Alternatives and similar repositories for llm-deploy
Users that are interested in llm-deploy are comparing it to the libraries listed below.
- Deep insight tensorrt, including but not limited to qat, ptq, plugin, triton_inference, cuda☆23Mar 8, 2026Updated 2 weeks ago
- Qt GUI application for Facebook's Segment Anything Model (SAM).☆10Jun 15, 2023Updated 2 years ago
- FastSAM deployment version: easy to port to different platforms, simple to deploy, and fast at runtime.☆24May 30, 2024Updated last year
- [ICLR 2026] Official implementation of DiCache: Let Diffusion Model Determine Its Own Cache☆58Jan 26, 2026Updated 2 months ago
- Project for SIGGRAPH 2022 paper "Interactive Augmented Reality Storytelling Guided by Scene Semantics"☆20Jul 20, 2022Updated 3 years ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths☆17Jul 10, 2025Updated 8 months ago
- Instruction tuning dataset generation inspired by LLaVA-Instruct-158k via any LLM, also for commercial use.☆13Mar 13, 2024Updated 2 years ago
- A lightweight and easy to use async IO library implemented with io_uring and C++20 coroutine.☆13Feb 11, 2025Updated last year
- LLM Agents: Landing Page Generation for an E-commerce Platform Using CrewAI, Groq-LangChain and Qdrant☆15May 30, 2024Updated last year
- ☆20Jun 1, 2023Updated 2 years ago
- DAMIAO motor control driver: a Linux motor driver for Damiao (达妙) motors, a further heavy modification of an open-source project☆41Nov 13, 2024Updated last year
- ☆10Dec 14, 2020Updated 5 years ago
- ☆16Sep 24, 2019Updated 6 years ago
- Ray Framework (https://github.com/ray-project/ray) on Kubernetes☆13Oct 12, 2018Updated 7 years ago
- A summary of MoE experimental setups across a number of different papers.☆16Feb 16, 2023Updated 3 years ago
- A High-Level DRAM Timing, Power and Area Exploration Tool☆29Jul 29, 2020Updated 5 years ago
- Inference speed-up for stable-diffusion (ldm) with TensorRT.☆35Jun 19, 2023Updated 2 years ago
- A repository for all the STRANDS-augmented movebase, including 3D obstacle avoidance, etc.☆10Nov 26, 2019Updated 6 years ago
- B-Roll: Video data in rosbag2 plugins and utilities☆12Nov 19, 2025Updated 4 months ago
- Ament task provider extension for vscode☆10Mar 16, 2026Updated last week
- First-generation personal robotic arm; PRs welcome☆38Jun 9, 2025Updated 9 months ago
- Thoughts and summaries☆10Feb 18, 2016Updated 10 years ago
- Damiao Motor Control Library – A Python library for controlling Damiao motors via CAN. Supports Windows, Linux, and macOS. Flexible contr…☆53Dec 16, 2024Updated last year
- SimplePIM is the first high-level programming framework for real-world processing-in-memory (PIM) architectures. Described in the PACT 20…☆31Oct 23, 2023Updated 2 years ago
- ros package for z1 simulation☆42Dec 31, 2024Updated last year
- Examples of using Isaac ROS GEMs together☆17Updated this week
- Paper summary of 2D video generation. Updated 2021.06☆16Apr 30, 2021Updated 4 years ago
- ROS2 package that allows recording without interprocess communication☆17Feb 25, 2025Updated last year
- Docker + PaddleOCR + FastAPI☆26Feb 1, 2023Updated 3 years ago
- ☆11Aug 15, 2025Updated 7 months ago
- ☆11Aug 24, 2025Updated 7 months ago
- Framework for creating ROS dashboards in RQT☆12Jan 26, 2026Updated 2 months ago
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM☆180Jul 12, 2024Updated last year
- Simple example application to show how to allocate dmabufs from user space (from a dmabuf heap) and use them for v4l2 capture.☆16Aug 2, 2023Updated 2 years ago
- Simple example of using pybind11 via Bazel.☆11Apr 30, 2020Updated 5 years ago
- ☆17Jun 10, 2025Updated 9 months ago
- Tensorflow code for WACV 2019 paper "Attention Based Natural Language Grounding by Navigating Virtual Environment" - https://arxiv.org/ab…☆17Nov 7, 2018Updated 7 years ago
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models☆24Oct 5, 2024Updated last year
- Provides a demo of micro-ROS based on ST Disco L475 IOT01 board.☆13Jul 14, 2021Updated 4 years ago