Customized matrix multiplication kernels
☆57 · Mar 5, 2022 · Updated 4 years ago
Alternatives and similar repositories for custom_matmul_kernels
Users that are interested in custom_matmul_kernels are comparing it to the libraries listed below.
- Accelerating CNN's convolution operation on GPUs by using memory-efficient data access patterns. ☆14 · Dec 8, 2017 · Updated 8 years ago
- Reinforcement learning modular with pytorch ☆11 · Jan 18, 2021 · Updated 5 years ago
- ☆12 · Sep 29, 2021 · Updated 4 years ago
- Fast Emulation of Approximate DNN Accelerators in PyTorch ☆31 · Feb 23, 2024 · Updated 2 years ago
- 4th place solution to datafactory challenge by Intermarché. ☆12 · Jun 28, 2021 · Updated 5 years ago
- All the useful tools I have been using while working in data science for remote sensing ☆11 · Nov 27, 2019 · Updated 6 years ago
- A nascent Jax-based package for virtual brain modeling. ☆14 · May 5, 2026 · Updated last month
- 📝 Do you use several commands in your terminal, one after the other? This tool allows you to combine multiple templated bash commands … ☆37 · May 5, 2022 · Updated 4 years ago
- Approximate layers - TensorFlow extension ☆27 · Apr 14, 2025 · Updated last year
- Overview of IR/NLP papers covered in my team's reading group. ☆10 · May 5, 2020 · Updated 6 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Jun 16, 2026 · Updated last week
- ☆12 · Jun 14, 2021 · Updated 5 years ago
- PolyMage is a domain-specific language and optimizing code generator for auto-parallelisation ☆14 · Jul 15, 2016 · Updated 9 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- LLVM/MLIR based compiler instrumentation of AMD GPU kernels ☆21 · Jul 13, 2025 · Updated 11 months ago
- UIE (Universal Information Extraction) infer by ncnn ☆15 · Sep 22, 2024 · Updated last year
- Hessian trace estimation using PyTorch and Hutch++ ☆20 · Oct 29, 2020 · Updated 5 years ago
- Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch ☆98 · Feb 19, 2021 · Updated 5 years ago
- A study for a custom convolution layer in which the x and y components of an image pixel are added to the kernel inputs. ☆12 · Feb 21, 2020 · Updated 6 years ago
- Benchmark your NCNN models on 3DS (or crash) ☆10 · Apr 15, 2024 · Updated 2 years ago
- The Structure and Interpretation of Deep Networks Handbook ☆14 · Dec 14, 2024 · Updated last year
- Causal Fairness Analysis ☆21 · Apr 16, 2025 · Updated last year
- A script for PyTorch multi-GPU multi-process testing ☆24 · Apr 29, 2024 · Updated 2 years ago
- Yaae: Yet another autodiff engine (written in Numpy). ☆28 · Jul 6, 2023 · Updated 2 years ago
- Implementation of Flash Attention in Jax ☆228 · Mar 1, 2024 · Updated 2 years ago
- Cython/Python bindings of E2LSH by Andoni and Symmetric LSH for MIPS by Neyshabur ☆15 · Feb 10, 2016 · Updated 10 years ago
- A python library for highly configurable transformers - easing model architecture search and experimentation. ☆48 · Nov 30, 2021 · Updated 4 years ago
- Universal Python binding for the LMDB 'Lightning' Database ☆13 · Nov 7, 2017 · Updated 8 years ago
- Neural Network Based Dependency Parsers ☆11 · Jan 14, 2016 · Updated 10 years ago
- ☆20 · Feb 12, 2025 · Updated last year
- LLM-DSE: Searching Accelerator Parameters with LLM Agents ☆16 · May 22, 2025 · Updated last year
- A GPU FP32 computation method with Tensor Cores. ☆27 · Dec 8, 2025 · Updated 6 months ago
- Make triton easier ☆50 · Jun 12, 2024 · Updated 2 years ago
- Quantization of Convolutional Neural networks. ☆250 · Aug 5, 2024 · Updated last year
- Reparameterize your PyTorch modules ☆70 · Dec 31, 2020 · Updated 5 years ago
- a lightweight transformer library for PyTorch ☆71 · Nov 2, 2021 · Updated 4 years ago
- Scale-out system monitoring ☆24 · Jun 16, 2026 · Updated 2 weeks ago
- Code for Solving Black-Box Optimization Challenge via Learning Search Space Partition for Local Bayesian Optimization. ☆21 · Aug 20, 2021 · Updated 4 years ago
- Code for making #GANterpretations ☆23 · Nov 30, 2020 · Updated 5 years ago