Getting started with TensorRT-LLM using BLOOM as a case study
☆24 · Mar 7, 2024 · Updated 2 years ago
Alternatives and similar repositories for TensorRT-LLM-Tutorial
Users that are interested in TensorRT-LLM-Tutorial are comparing it to the libraries listed below
- This repository provides the data and the codes used in the AAAI'24 paper, COOPER: Coordinating Specialized Agents towards a Complex Dial… ☆27 · Mar 1, 2024 · Updated 2 years ago
- Chat language model that can interpret and execute functions/plugins ☆14 · Oct 16, 2024 · Updated last year
- Easy-to-use Retrieval-Enhanced Transformer implementation ☆10 · Sep 30, 2022 · Updated 3 years ago
- Repo for paper: Controllable Text Generation with Language Constraints ☆20 · Jun 20, 2023 · Updated 2 years ago
- ☆25 · Mar 19, 2024 · Updated 2 years ago
- ☆24 · Apr 30, 2025 · Updated 10 months ago
- Algorithms for approximate attention in LLMs ☆21 · Apr 14, 2025 · Updated 11 months ago
- NeRF with clean and well-annotated PyTorch re-implementation ☆18 · Oct 13, 2023 · Updated 2 years ago
- Convert Wiktionary entries to various formats such as StarDict or DB (MariaDB/MySQL). I'm dropping the database support for this new main… ☆17 · Oct 5, 2025 · Updated 5 months ago
- LLM inference in C/C++ ☆107 · Mar 11, 2026 · Updated last week
- [COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models ☆24 · Oct 5, 2024 · Updated last year
- ☆38 · Nov 24, 2020 · Updated 5 years ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆36 · Jul 11, 2024 · Updated last year
- 3rd party dependencies for DALI project ☆11 · Mar 10, 2026 · Updated last week
- This is the repository of our ACL 2024 paper "ESCoT: Towards Interpretable Emotional Support Dialogue Systems". ☆38 · May 10, 2025 · Updated 10 months ago
- This kernel adds support for running Docker on Sony Xperia 5 II (pdx206). ☆10 · Mar 14, 2023 · Updated 3 years ago
- The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25] ☆16 · Sep 12, 2025 · Updated 6 months ago
- Some simple scripts, pushed gradually ☆14 · Apr 18, 2024 · Updated last year
- ☆12 · Apr 29, 2021 · Updated 4 years ago
- ☆76 · Mar 7, 2024 · Updated 2 years ago
- Setup an MCP server in 60 seconds. ☆13 · Dec 12, 2024 · Updated last year
- Fine-tuning LLM with LoRA (Low-Rank Adaptation) from scratch (Oct 2023) ☆32 · Jul 30, 2025 · Updated 7 months ago
- The Triton TensorRT-LLM Backend ☆926 · Updated this week
- Modeling tool like DBT to use SQL Alchemy core with a DataFrame interface like ☆11 · Apr 23, 2023 · Updated 2 years ago
- ☆31 · Nov 21, 2022 · Updated 3 years ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆70 · Nov 17, 2025 · Updated 4 months ago
- bev_lane_det with lower resolution ☆10 · Sep 1, 2023 · Updated 2 years ago
- A Simple PyTorch Lightning implementation of Masked Autoencoder ☆16 · Jun 29, 2023 · Updated 2 years ago
- A Movie Recommendation Engine that uses collaborative and content based filtering to suggest movies. ☆44 · Mar 8, 2023 · Updated 3 years ago
- ☆12 · Jan 20, 2026 · Updated 2 months ago
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory. ☆72 · Oct 17, 2024 · Updated last year
- Copilot with deepseek and more... ☆13 · Mar 7, 2025 · Updated last year
- PyTorch Static Quantization Example ☆41 · Apr 29, 2021 · Updated 4 years ago
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and export to onnx/onnx-runtime easily. ☆186 · Mar 4, 2026 · Updated 2 weeks ago
- 🍳🚀 CookFast is a free AI tool that writes essential product documents (like Requirements Docs & Application Flows) from your idea, help… ☆14 · Dec 19, 2025 · Updated 3 months ago
- AI Search engine ☆13 · Sep 24, 2025 · Updated 5 months ago
- ComfyUI Workflows ☆10 · Sep 27, 2025 · Updated 5 months ago
- Quick access to any large language model from your browser. ☆10 · Feb 16, 2026 · Updated last month
- Pytorch implementation of BRECQ, ICLR 2021 ☆292 · Aug 1, 2021 · Updated 4 years ago