Transformer related optimization, including BERT, GPT
☆17Jul 29, 2023Updated 2 years ago
Alternatives and similar repositories for FasterTransformer
Users who are interested in FasterTransformer are comparing it to the libraries listed below
- Transformer related optimization, including BERT, GPT☆39Feb 10, 2023Updated 3 years ago
- Revision of official yolov7-pose to support custom dataset for keypoint detection☆11Nov 12, 2023Updated 2 years ago
- Chinese text error correction based on seq2edit (Gector).☆29Nov 15, 2022Updated 3 years ago
- [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression☆33Aug 7, 2025Updated 6 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆12Nov 14, 2025Updated 3 months ago
- Uses yolov5 to detect road-occupation operations and vehicle parking violations on urban streets, and can independently delin…☆12Jan 2, 2023Updated 3 years ago
- Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation☆12Feb 16, 2025Updated last year
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052 ☆478Mar 15, 2024Updated last year
- Zipkin client for asgi. Compatible with Starlette Framework and Jaeger tracing server☆10Apr 21, 2024Updated last year
- Performance tests for multinode NGC.Ready certification☆15Jan 28, 2026Updated last month
- ☆22Dec 23, 2025Updated 2 months ago
- ☆11Sep 4, 2022Updated 3 years ago
- ☆23Jun 19, 2025Updated 8 months ago
- ☆22Dec 11, 2025Updated 2 months ago
- Active Learning with Partial Feedback, ICLR 2019☆11Apr 27, 2020Updated 5 years ago
- A simplified flash-attention implementation using cutlass, intended for teaching purposes☆57Aug 12, 2024Updated last year
- ☆16Feb 18, 2025Updated last year
- istio http load balance☆10Aug 12, 2019Updated 6 years ago
- This code is for converting COCO json annotations to YOLO txt format (which both are common in object detection projects).☆10Feb 19, 2024Updated 2 years ago
- Compress BiSeNet with Structure Knowledge Distillation for Real-time image segmentation on wali-TX2☆11Jul 29, 2020Updated 5 years ago
- ☆11May 20, 2022Updated 3 years ago
- Machine learning experiment - linear regression - predicting continuous values☆11Aug 11, 2017Updated 8 years ago
- A simplified implementation inspired by Cline☆10Mar 11, 2025Updated 11 months ago
- ☆13May 25, 2023Updated 2 years ago
- ☆10Aug 18, 2023Updated 2 years ago
- Python implementation based on the official Redis article on distributed locks☆10Jan 16, 2021Updated 5 years ago
- ☆13May 17, 2025Updated 9 months ago
- Web app for makeup transfer using Stable Diffusion☆10Sep 11, 2023Updated 2 years ago
- Image Text Segmentation using FAST corner detection and DBSCAN clustering with k-d tree data structure☆14Feb 27, 2019Updated 7 years ago
- [AAAI 2026] Official Code for VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning☆19Nov 28, 2025Updated 3 months ago
- a fast and customizable CUDA int4 tensor core gemm☆15Aug 2, 2024Updated last year
- Remote sensing labwork☆12Feb 27, 2018Updated 8 years ago
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)☆12Mar 6, 2025Updated 11 months ago
- ☆413Nov 11, 2023Updated 2 years ago
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆1,051Updated this week
- frame is an open source portfolio builder for developers where developers can add and manage their information, projects, articles and mo…☆14Dec 24, 2024Updated last year
- ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback☆18Dec 3, 2024Updated last year
- [ICLR 2023] PyTorch code for DFPC: Data flow driven pruning of coupled channels without data.☆15Aug 25, 2023Updated 2 years ago
- AI Router☆14Aug 1, 2024Updated last year