☆39 · Mar 14, 2024 · Updated 2 years ago
Alternatives and similar repositories for nomad-dist
Users that are interested in nomad-dist are comparing it to the libraries listed below.
- ☆17 · Jul 24, 2023 · Updated 2 years ago
- This adds partial support of AVX2 and AVX-512 to gem5. ☆15 · Dec 19, 2023 · Updated 2 years ago
- Residual vector quantization for KV cache compression in large language models ☆12 · Oct 22, 2024 · Updated last year
- ☆20 · Sep 28, 2024 · Updated last year
- ☆82 · Apr 1, 2024 · Updated 2 years ago
- Differentiable Clustering with Perturbed Random Forests, NeurIPS2023 ☆13 · Oct 16, 2023 · Updated 2 years ago
- Tools and APIs to develop weavers for the LARA language (LARA Compiler, LARA Interpreter, Weaver Generator, etc...) ☆16 · Mar 18, 2026 · Updated last month
- ☆33 · Apr 2, 2025 · Updated last year
- [HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache. ☆86 · Dec 18, 2025 · Updated 4 months ago
- Starlight: A Kernel Optimizer for GPU Processing ☆16 · Jan 10, 2024 · Updated 2 years ago
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core. ☆74 · Sep 8, 2024 · Updated last year
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 · Apr 2, 2025 · Updated last year
- ☆14 · Nov 20, 2022 · Updated 3 years ago
- Multi-branch model for concurrent execution ☆18 · Jun 27, 2023 · Updated 2 years ago
- The official implementation of the DAC 2024 paper GQA-LUT ☆22 · Dec 20, 2024 · Updated last year
- How to plot for papers, slides, demos, etc. ☆10 · Apr 7, 2022 · Updated 4 years ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆113 · Oct 15, 2024 · Updated last year
- Opara is a lightweight and resource-aware DNN Operator parallel scheduling framework to accelerate the execution of DNN inference on GPUs… ☆23 · Dec 19, 2024 · Updated last year
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf ☆21 · Jul 29, 2024 · Updated last year
- ☆23 · Dec 23, 2025 · Updated 4 months ago
- Benchmark tests supporting the TiledCUDA library. ☆18 · Nov 19, 2024 · Updated last year
- Custom BLAS and LAPACK Cross-Compilation Framework for RISC-V ☆19 · Apr 26, 2020 · Updated 6 years ago
- A Data Science pipeline for Algorithmic Trading: A comparative study in applications to Finance and cryptoeconomics ☆14 · Jul 1, 2022 · Updated 3 years ago
- Reproducible code for Augmentation paper ☆17 · Jan 23, 2019 · Updated 7 years ago
- Here is the repo for public scripts. ☆12 · Jul 16, 2022 · Updated 3 years ago
- ☆14 · Aug 25, 2021 · Updated 4 years ago
- ☆28 · Feb 23, 2026 · Updated 2 months ago
- A reproduction of the libsmctrl paper, with an added Python interface that can be called flexibly from Python to allocate compute resources ☆12 · May 21, 2024 · Updated last year
- Accelerated in CUDA ☆11 · Oct 28, 2022 · Updated 3 years ago
- Low-bit LLM inference on CPU/NPU with lookup table ☆953 · Jun 5, 2025 · Updated 10 months ago
- An official lightweight library for the RaBitQ algorithm and its applications in vector search. ☆194 · Apr 13, 2026 · Updated 2 weeks ago
- ☆72 · Mar 26, 2025 · Updated last year
- ☆15 · May 23, 2024 · Updated last year
- The official site of the CVPR 2022 Affine Correspondences and Their Applications tutorial ☆11 · Jan 17, 2023 · Updated 3 years ago
- Slowdown prediction module of Echo: Simulating Distributed Training at Scale ☆13 · May 17, 2025 · Updated 11 months ago
- FractalTensor is a programming framework that introduces a novel approach to organizing data in deep neural networks (DNNs) as a list of … ☆31 · Dec 21, 2024 · Updated last year
- This is the official code for CoRL 2022 "Robustness Certification of Visual Perception Models via Camera Motion Smoothing" ☆11 · Apr 5, 2023 · Updated 3 years ago
- Official code of MoSA (Mixture of Sparse Adapters). ☆13 · Dec 14, 2023 · Updated 2 years ago
- TileGraph is an experimental DNN compiler that utilizes static code generation and kernel fusion techniques. ☆11 · Sep 18, 2024 · Updated last year