bigeagle / picoGPT
☆37Updated 2 years ago
Alternatives and similar repositories for picoGPT:
Users who are interested in picoGPT are comparing it to the libraries listed below
- Efficient inference of large language models.☆146Updated 3 months ago
- Programming exercises for kids (no prior programming experience required)☆14Updated 8 months ago
- my dotfiles..☆61Updated this week
- A demo of how to write a high-performance convolution that runs on Apple silicon☆54Updated 3 years ago
- ONNX Command-Line Toolbox☆35Updated 5 months ago
- ☆124Updated last year
- ☆51Updated 2 weeks ago
- GPTQ inference TVM kernel☆39Updated 11 months ago
- Benchmark your NCNN models on 3DS (or crash)☆9Updated 11 months ago
- An easy way to run, test, benchmark and tune OpenCL kernel files☆23Updated last year
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Updated last year
- ☆11Updated 3 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions☆21Updated 2 years ago
- ☆12Updated 2 years ago
- ☆22Updated 5 years ago
- Summary of system papers/frameworks/code/tools on training or serving large models☆56Updated last year
- An April Fools' joke, an LLVM backend for CMake☆47Updated 3 years ago
- ☆22Updated 3 years ago
- No-GIL Python environment featuring NVIDIA Deep Learning libraries.☆53Updated 3 weeks ago
- Static analysis framework for analyzing programs written in TVM's Relay IR.☆28Updated 5 years ago
- An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.☆51Updated 8 months ago
- Odysseus: Playground of LLM Sequence Parallelism☆66Updated 9 months ago
- Standalone Flash Attention v2 kernel without libtorch dependency☆106Updated 6 months ago
- My tests and experiments with some popular dl frameworks.☆12Updated last month
- TensorFlow and TVM integration☆38Updated 4 years ago
- Run inference on TinyLlama models with ncnn☆24Updated last year
- A library for syntactically rewriting Python programs, pronounced (sinner).☆70Updated 3 years ago
- InsNet Runs Instance-dependent Neural Networks with Padding-free Dynamic Batching.☆66Updated 3 years ago
- We invite you to visit and follow our new repository at https://github.com/microsoft/TileFusion. TiledCUDA is a highly efficient kernel …☆179Updated 2 months ago
- System for automated integration of deep learning backends.☆48Updated 2 years ago