A simple general-purpose programming language
☆100Feb 2, 2026Updated last month
Alternatives and similar repositories for prajna
Users that are interested in prajna are comparing it to the libraries listed below
- A tensor computing compiler based on tile programming for GPU, CPU, or TPU☆45Feb 2, 2026Updated last month
- GPTQ inference TVM kernel☆40Apr 25, 2024Updated last year
- ☆17Jan 1, 2024Updated 2 years ago
- A toolkit for developers to simplify the transformation of nn.Module instances. It now corresponds to Pytorch.fx.☆13Apr 7, 2023Updated 2 years ago
- ☆11Dec 26, 2025Updated 2 months ago
- ☆20Aug 11, 2022Updated 3 years ago
- ☆12Sep 1, 2023Updated 2 years ago
- A simple neural network inference framework☆25Aug 1, 2023Updated 2 years ago
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago
- ☆15Apr 15, 2022Updated 3 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer☆95Feb 20, 2026Updated last month
- Conversions to MLIR EmitC☆135Dec 12, 2024Updated last year
- MIAOW2.0 FPGA implementable design☆12Oct 18, 2017Updated 8 years ago
- KCL Multiple Language Bindings including Rust, Go, Python, Java, Kotlin, .NET, Swift, Lua, Node.js, Zig, C, C++, WASM, etc.☆19Mar 2, 2026Updated 2 weeks ago
- ☆19Oct 7, 2025Updated 5 months ago
- ☆40Feb 28, 2020Updated 6 years ago
- A C++/CUDA template library for lazy evaluation of tensors☆165May 8, 2023Updated 2 years ago
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆43Feb 27, 2025Updated last year
- Self-built Chisel project template☆14Jul 19, 2023Updated 2 years ago
- OneFlow->ONNX☆43Apr 19, 2023Updated 2 years ago
- PolyLib official git.☆11Jan 27, 2026Updated last month
- This repository contains companion software for the Colfax Research paper "Categorical Foundations for CuTe Layouts".☆116Sep 24, 2025Updated 5 months ago
- A demo of how to write a high-performance convolution that runs on Apple silicon☆57Feb 8, 2022Updated 4 years ago
- Triton for DSA☆60Updated this week
- ☆41Mar 31, 2022Updated 3 years ago
- Musings in GEMM (General Matrix Multiplication)☆14Dec 14, 2025Updated 3 months ago
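- Notes and thoughts from reading various papers (focused on computer architecture, machine learning systems, deep learning, and computer vision)☆25Oct 25, 2019Updated 6 years ago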
- The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept github…☆21Dec 3, 2020Updated 5 years ago
- RustTalk episode planning☆17Jun 7, 2025Updated 9 months ago
- Implement Flash Attention using Cute.☆102Dec 17, 2024Updated last year
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs☆62Mar 25, 2025Updated 11 months ago
- Editor for the ncnn and pnnx formats☆137Oct 7, 2024Updated last year
- Debug print operator for cudagraph debugging☆14Aug 2, 2024Updated last year
- RDMA optimization tutorial for beginners, based on verbs and rdmacm, for high-performance computing and disaggregated memory systems☆15Sep 30, 2024Updated last year
- An LR(1) parser generator targeting C++17.☆13Jul 8, 2020Updated 5 years ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- A model compilation solution for various hardware☆467Aug 20, 2025Updated 7 months ago
- ☆29Oct 6, 2021Updated 4 years ago
- A super tiny RISC-V emulator that is able to run xv6.☆76Aug 16, 2022Updated 3 years ago