☆25 · Nov 10, 2025 · Updated 4 months ago
Alternatives and similar repositories for popcorn
Users that are interested in popcorn are comparing it to the libraries listed below
- Write a fast kernel and see how you compare against the best humans and AI on gpumode.com · ☆88 · Updated this week
- ☆20 · Mar 3, 2026 · Updated 2 weeks ago
- Simple starter CMake project that uses NVBench. · ☆16 · May 6, 2025 · Updated 10 months ago
- ☆14 · Updated this week
- 速云梯 latest official website address and discount code · ☆14 · Jan 9, 2026 · Updated 2 months ago
- A library containing general purpose Python utils. · ☆14 · Feb 22, 2023 · Updated 3 years ago
- ☆12 · Oct 19, 2014 · Updated 11 years ago
- ☆11 · Nov 13, 2020 · Updated 5 years ago
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025) · ☆15 · Jan 6, 2026 · Updated 2 months ago
- ☆22 · Aug 23, 2022 · Updated 3 years ago
- Video retrieval from query images · ☆11 · Oct 10, 2017 · Updated 8 years ago
- ☆13 · Dec 3, 2021 · Updated 4 years ago
- ☆10 · Aug 11, 2022 · Updated 3 years ago
- multimodal anomaly detection · ☆14 · Jan 17, 2021 · Updated 5 years ago
- Compile time fixed point scalars and n-dimensional arrays for C++17. · ☆12 · Feb 18, 2018 · Updated 8 years ago
- CUDA GPU Benchmark · ☆37 · Jan 31, 2025 · Updated last year
- ☆10 · Jul 28, 2021 · Updated 4 years ago
- Classifying Relations by Ranking with Convolutional Neural Networks · ☆12 · May 22, 2019 · Updated 6 years ago
- DLBlas: clean and efficient kernels · ☆35 · Updated this week
- PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations · ☆17 · Apr 25, 2020 · Updated 5 years ago
- Alex Graves' Adaptive Computation Time in PyTorch · ☆14 · Jan 9, 2018 · Updated 8 years ago
- iMLBench is a machine learning benchmark suite targeting CPU-GPU integrated architectures. · ☆11 · May 29, 2021 · Updated 4 years ago
- Transactional memory (mostly Intel® TSX) experiments · ☆14 · May 3, 2014 · Updated 11 years ago
- ☆11 · Nov 21, 2020 · Updated 5 years ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. · ☆18 · Feb 9, 2026 · Updated last month
- Leverage the Intel® Distribution of OpenVINO™ Toolkit to fast-track development of high-performance computer vision and deep learning inf… · ☆10 · Jul 28, 2020 · Updated 5 years ago
- ☆12 · Oct 22, 2019 · Updated 6 years ago
- C++ intrusive container templates. Abstract node links, no use of new/delete. · ☆10 · Mar 5, 2026 · Updated 2 weeks ago
- Topology Aware Task Mapping Tool · ☆14 · Jul 27, 2016 · Updated 9 years ago
- Korean phoneme dictionary generator for training Montreal Forced Aligner (MFA) · ☆13 · Feb 27, 2021 · Updated 5 years ago
- Chinese Version of ACL 2020 PC Blogs (Chinese edition of the ACL 2020 Program Committee blog posts) · ☆15 · Apr 15, 2020 · Updated 5 years ago
- Repository for answers for exercises in Programming Massively Parallel Processors book · ☆16 · Aug 10, 2024 · Updated last year
- Triton-style kernel toolkit for MLX plus a small upstream incubator: prototype, benchmark, and upstream fusions for Apple Silicon · ☆40 · Mar 3, 2026 · Updated 2 weeks ago
- FlashInfer Bench @ MLSys 2026: Building AI agents to write high performance GPU kernels · ☆150 · Mar 13, 2026 · Updated last week
- OpenVec enables portable explicit vectorization on different hardware architectures · ☆14 · Oct 23, 2023 · Updated 2 years ago
- An example platform daemon in Rust; written for Mastering Embedded Linux · ☆12 · May 8, 2020 · Updated 5 years ago
- Investigating Cultural Alignment of Large Language Models · ☆13 · Aug 14, 2024 · Updated last year
- AArch64cryptolib is a from scratch implementation of cryptographic primitives aiming for optimal performance on Arm A-class cores · ☆43 · May 27, 2025 · Updated 9 months ago
- Build a TensorFlow Lite based computer vision emoji input device with OpenMV 📷 → ✋ 👎 👍 👊 · ☆12 · Nov 28, 2022 · Updated 3 years ago