second-state / meetups
☆70, updated 4 months ago
Alternatives and similar repositories for meetups
Users interested in meetups are comparing it to the libraries listed below.
- ☆128, updated 8 months ago
- ☆23, updated 7 months ago
- ☆79, updated last year
- Transformer related optimization, including BERT, GPT (☆17, updated 2 years ago)
- LLM Inference benchmark (☆426, updated last year)
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec… (☆173, updated 3 weeks ago)
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. (☆104, updated 3 months ago)
- ☆54, updated last week
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. (☆111, updated last year)
- A high-performance, highly extensible, easy-to-use framework for AI applications. Provides AI application developers with a unified, high-performance, easy-to-use programming framework for quickly building cross device-edge-cloud AI industry applications on top of full-stack AI services; supports GPU, … (☆157, updated last year)
- Tutorials for writing high-performance GPU operators in AI frameworks. (☆130, updated 2 years ago)
- DeepSparkHub selects hundreds of application algorithms and models, covering various fields of AI and general-purpose computing, to suppo… (☆65, updated this week)
- ☆140, updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … (☆263, updated 2 weeks ago)
- PyTorch distributed training acceleration framework (☆51, updated last week)
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang (☆55, updated 9 months ago)
- ☆24, updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). (☆246, updated last year)
- ☆59, updated 9 months ago
- ☆220, updated 2 years ago
- ☆58, updated 5 years ago
- ☆84, updated last week
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver… (☆257, updated this week)
- Optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052 (☆475, updated last year)
- OneFlow->ONNX (☆43, updated 2 years ago)
- llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deploy… (☆86, updated last year)
- DeepLearning Framework Performance Profiling Toolkit (☆287, updated 3 years ago)
- FlagTree is a unified compiler for multiple AI chips, which is forked from triton-lang/triton. (☆72, updated this week)
- Transformer related optimization, including BERT, GPT (☆39, updated 2 years ago)
- Free resource for the book AI Compiler Development Guide (☆46, updated 2 years ago)
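One of the repositories above compares hardware platforms for LLM inference via the Roofline Model. The model itself is a single formula: attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity. A minimal sketch follows; the hardware numbers are hypothetical and not taken from any listed project:

```python
def roofline_attainable_gflops(peak_gflops: float, mem_bw_gbs: float,
                               arithmetic_intensity: float) -> float:
    """Roofline Model: attainable GFLOP/s is capped by either peak
    compute or memory bandwidth * arithmetic intensity (FLOPs/byte)."""
    return min(peak_gflops, mem_bw_gbs * arithmetic_intensity)

# Hypothetical accelerator: 300 TFLOP/s peak, 2 TB/s memory bandwidth.
peak = 300_000  # GFLOP/s
bw = 2_000      # GB/s

# LLM decode reads every weight per token, so intensity is low
# (roughly 2 FLOPs per weight byte at batch size 1): memory-bound.
print(roofline_attainable_gflops(peak, bw, 2.0))    # 4000.0

# Prefill/large-batch GEMMs have high intensity: compute-bound.
print(roofline_attainable_gflops(peak, bw, 500.0))  # 300000
```

The ridge point (here 150 FLOPs/byte, peak divided by bandwidth) is where a kernel switches from memory-bound to compute-bound; comparing platforms means comparing these two ceilings.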