☆39Mar 14, 2024Updated 2 years ago
Alternatives and similar repositories for nomad-dist
Users that are interested in nomad-dist are comparing it to the libraries listed below.
- ☆16Jul 24, 2023Updated 2 years ago
- ☆20Sep 28, 2024Updated last year
- ☆82Apr 1, 2024Updated 2 years ago
- Memory-Bounded GPU Acceleration for Vector Search☆33Dec 29, 2025Updated 4 months ago
- Tools and APIs to develop weavers for the LARA language (LARA Compiler, LARA Interpreter, Weaver Generator, etc...)☆16May 6, 2026Updated 2 weeks ago
- Starlight: A Kernel Optimizer for GPU Processing☆16Jan 10, 2024Updated 2 years ago
- [HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache.☆89Updated this week
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.☆74Sep 8, 2024Updated last year
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.☆32Apr 2, 2025Updated last year
- ☆14Nov 20, 2022Updated 3 years ago
- The official implementation of the DAC 2024 paper GQA-LUT☆22Dec 20, 2024Updated last year
- How to plot for papers, slides, demos, etc.☆10Apr 7, 2022Updated 4 years ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆113Oct 15, 2024Updated last year
- Lightning Training strategy for HiveMind☆18Jan 20, 2026Updated 4 months ago
- Opara is a lightweight and resource-aware DNN Operator parallel scheduling framework to accelerate the execution of DNN inference on GPUs…☆23Dec 19, 2024Updated last year
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf☆21Jul 29, 2024Updated last year
- ☆12May 23, 2024Updated last year
- ☆23May 5, 2026Updated 2 weeks ago
- Benchmark tests supporting the TiledCUDA library.☆19Nov 19, 2024Updated last year
- Sample app to test Camera2 API's Multi Camera mode☆17Dec 11, 2018Updated 7 years ago
- Custom BLAS and LAPACK Cross-Compilation Framework for RISC-V☆19Apr 26, 2020Updated 6 years ago
- A Data Science pipeline for Algorithmic Trading: A comparative study in applications to Finance and cryptoeconomics☆14Jul 1, 2022Updated 3 years ago
- Reproducible code for Augmentation paper☆17Jan 23, 2019Updated 7 years ago
- Here is the repo for public scripts.☆12Jul 16, 2022Updated 3 years ago
- ☆14Aug 25, 2021Updated 4 years ago
- ☆28Feb 23, 2026Updated 2 months ago
- A reproduction of the libsmctrl paper, with an added Python-side interface that can be called flexibly from Python to allocate compute resources☆12May 21, 2024Updated last year
- A Python tool for tracking changes in Compute Express Link (CXL) features within the Linux kernel using GitHub API. It supports various o…☆13May 12, 2026Updated last week
- Accelerated in CUDA☆11Oct 28, 2022Updated 3 years ago
- Low-bit LLM inference on CPU/NPU with lookup table☆955Jun 5, 2025Updated 11 months ago
- An official lightweight library for the RaBitQ algorithm and its applications in vector search.☆206Updated this week
- ☆75Mar 26, 2025Updated last year
- The official site of the CVPR 2022 Affine Correspondences and Their Applications tutorial☆11Jan 17, 2023Updated 3 years ago
- Slowdown prediction module of Echo: Simulating Distributed Training at Scale☆13May 17, 2025Updated last year
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of …☆31Dec 21, 2024Updated last year
- Official code of MoSA (Mixture of Sparse Adapters).☆13Dec 14, 2023Updated 2 years ago
- Yet another Linux distro for RISC-V.☆13Dec 25, 2025Updated 4 months ago
- AI-Linux is a research project that tries to implement an Artificial Intelligence within the kernel of an operating system, In this case,…☆24Aug 22, 2018Updated 7 years ago
- CSiBE☆34Feb 17, 2022Updated 4 years ago