L1aoXingyu / llm-infer-bench
☆12 · Sep 1, 2023 · Updated 2 years ago
Alternatives and similar repositories for llm-infer-bench
Users that are interested in llm-infer-bench are comparing it to the libraries listed below
- codes for Neural Architecture Ranker and detailed cell information datasets based on NAS-Bench series ☆12 · Jul 11, 2022 · Updated 3 years ago
- ☆13 · Mar 27, 2023 · Updated 2 years ago
- Benchmarking Attention Mechanism in Vision Transformers. ☆20 · Oct 10, 2022 · Updated 3 years ago
- ☆120 · Apr 22, 2024 · Updated last year
- study of cutlass ☆22 · Nov 10, 2024 · Updated last year
- Source code for the paper "LongGenBench: Long-context Generation Benchmark" ☆24 · Oct 8, 2024 · Updated last year
- Implementation of PGONAS for CVPR22W and RD-NAS for ICASSP23 ☆23 · Apr 25, 2023 · Updated 2 years ago
- ☆26 · Oct 2, 2023 · Updated 2 years ago
- ☆23 · Jan 21, 2024 · Updated 2 years ago
- ☆29 · Oct 3, 2022 · Updated 3 years ago
- ☆149 · Oct 19, 2022 · Updated 3 years ago
- TVMScript kernel for deformable attention ☆25 · Dec 15, 2021 · Updated 4 years ago
- ☆25 · Jun 24, 2021 · Updated 4 years ago
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt… ☆30 · Oct 21, 2024 · Updated last year
- Agentic Learning Powered by AWorld ☆88 · Feb 7, 2026 · Updated last week
- [ICCV-2023] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization ☆28 · Dec 6, 2023 · Updated 2 years ago
- ☆38 · Oct 12, 2024 · Updated last year
- Document the demo and a series of documents for learning the diffusion model. ☆42 · Jun 29, 2023 · Updated 2 years ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization ☆172 · Nov 26, 2025 · Updated 2 months ago
- [ICML24] Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs ☆98 · Nov 25, 2024 · Updated last year
- Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization" ☆11 · Mar 31, 2024 · Updated last year
- ☆10 · Aug 15, 2022 · Updated 3 years ago
- ☆10 · Jul 16, 2023 · Updated 2 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer ☆96 · Sep 13, 2025 · Updated 5 months ago
- ☆43 · Oct 31, 2024 · Updated last year
- Auto choose the fastest CDN host for swagger-ui in /docs. ☆19 · Jan 12, 2026 · Updated last month
- Symbolic Graphics Programming with Large Language Models ☆37 · Sep 14, 2025 · Updated 5 months ago
- ☆12 · Mar 13, 2023 · Updated 2 years ago
- ChineseCLIP using online learning ☆13 · Nov 7, 2022 · Updated 3 years ago
- Fastai+PyTorch implementation of sparse model training methods (SET, SNFS, RigL) + customize-your-own. ☆10 · Oct 20, 2022 · Updated 3 years ago
- deep-reinforcement-learning-for-grasp ☆11 · Jun 20, 2019 · Updated 6 years ago
- Code for the ICRA2018 paper "Trajectory-Optimized Sensing for Active Search of Tissue Abnormalities in Robotic Surgery" ☆11 · May 22, 2018 · Updated 7 years ago
- Open source simulator for porous media flow ☆14 · Oct 15, 2022 · Updated 3 years ago
- Common template for PyTorch projects. Easy to extend and modify for a new project. ☆13 · Dec 13, 2022 · Updated 3 years ago
- a simple API to use CUPTI ☆11 · Aug 19, 2025 · Updated 5 months ago
- Enhanced version of original AutoGPTQ (https://github.com/PanQiWei/AutoGPTQ). ☆10 · Nov 2, 2023 · Updated 2 years ago
- Persistent dense gemm for Hopper in `CuTeDSL` ☆15 · Aug 9, 2025 · Updated 6 months ago
- Nano vLLM ☆12 · Jun 26, 2025 · Updated 7 months ago
- Vectorgraph Image Painter ☆12 · Mar 24, 2019 · Updated 6 years ago