PyTorch routines for (Ker)nel (Mac)hines
☆12Oct 10, 2025Updated 7 months ago
Alternatives and similar repositories for kermac
Users that are interested in kermac are comparing it to the libraries listed below.
- ☆67Apr 12, 2025Updated last year
- EigenPro Iteration in PyTorch☆19Jan 9, 2024Updated 2 years ago
- ☆23Jan 25, 2024Updated 2 years ago
- ☆19Nov 11, 2025Updated 6 months ago
- Vector Approximate Message Passing (VAMP)☆11May 14, 2023Updated 3 years ago
- Example to build PyTorch CUDA extension using CMake (with pybind11 and scikit-build)☆12May 26, 2020Updated 6 years ago
- Code for lin-RFM used for sparse recovery tasks☆17Mar 13, 2025Updated last year
- A Top-Down Profiler for GPU Applications☆22Feb 29, 2024Updated 2 years ago
- Parallel Self-Adjusting Computation☆16Jul 5, 2021Updated 4 years ago
- LaTeX template files for dissertations and theses formatted according to UCLA graduate division's requirements☆15Jul 11, 2022Updated 3 years ago
- carbon.now.sh python module☆11Oct 10, 2021Updated 4 years ago
- Code for the paper "Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations"☆35Dec 4, 2025Updated 6 months ago
- Chess engine in C++☆10Updated this week
- Benchmarking Optimizers for LLM Pretraining☆60May 3, 2026Updated last month
- Automatic differentiation for Triton Kernels☆29Aug 12, 2025Updated 9 months ago
- CUTLASS and CuTe Examples☆136Nov 30, 2025Updated 6 months ago
- An end-to-end MATLAB toolkit for completely unsupervised Speaker Diarization using state-of-the-art algorithms.☆15Dec 22, 2015Updated 10 years ago
- Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent☆14Feb 6, 2020Updated 6 years ago
- ☆16Dec 1, 2024Updated last year
- (ICLR 2025 Spotlight) TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks☆90Jun 4, 2025Updated last year
- An implementation of the Hopfield Network using PyTorch, leveraging CUDA for linear algebra speedup☆15Nov 19, 2025Updated 6 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated 2 years ago
- Benchmarks of different devices I have come across☆43Aug 28, 2025Updated 9 months ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆11Dec 30, 2024Updated last year
- ☆29Jan 17, 2025Updated last year
- A chrome extension to embrace your Dark Side!☆11Feb 14, 2021Updated 5 years ago
- Simple MoE - Day 17 of 365 Days of Repos☆19Jun 2, 2026Updated last week
- Elevator scheduling, an operating systems process management project; a fairly simple implementation☆16May 24, 2021Updated 5 years ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- Exploring the minimal architecture required for coherent English language generation.☆13May 27, 2026Updated last week
- Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis.☆13Mar 20, 2025Updated last year
- generate an ELF64 binary from scratch☆17Jan 18, 2019Updated 7 years ago
- diffusers with search engine☆12Jan 13, 2026Updated 4 months ago
- A bunch of kernels that might make stuff slower 😉☆90Updated this week
- Data ingestion and curation tools☆18Dec 13, 2024Updated last year
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆22Feb 19, 2026Updated 3 months ago
- ☆14Mar 8, 2025Updated last year
- ☆13May 14, 2025Updated last year
- Framework for Algorithmic Correctness Testing of Operators☆16Mar 9, 2026Updated 3 months ago