menloresearch / cortex.tensorrt-llm
Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It includes NVIDIA's TensorRT-LLM as a submodule for GPU-accelerated inference on NVIDIA GPUs.
☆ 42 · Updated 11 months ago
Alternatives and similar repositories for cortex.tensorrt-llm
Users interested in cortex.tensorrt-llm are comparing it to the libraries listed below.
- Tcurtsni: Reverse Instruction Chat; ever wonder what your LLM wants to ask you?