abacusai / gh200-llm
Docker image for NVIDIA GH200 machines - optimized for vLLM serving and HF Trainer fine-tuning
☆53 · Feb 22, 2025 · Updated 11 months ago
Alternatives and similar repositories for gh200-llm
Users interested in gh200-llm are comparing it to the libraries listed below.
- CPU and GPU tutorial examples ☆13 · Apr 4, 2025 · Updated 10 months ago
- ☆13 · Jan 7, 2025 · Updated last year
- 🚀 LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases. ☆13 · Jul 12, 2025 · Updated 7 months ago
- A Triton-only attention backend for vLLM ☆23 · Updated this week
- ☆42 · Jan 24, 2026 · Updated 3 weeks ago
- Slides and exercises for persistent memory programming tutorial ☆14 · Nov 14, 2022 · Updated 3 years ago
- ☆19 · Aug 10, 2024 · Updated last year
- Scalable data movement in Exascale Supercomputers ☆17 · Dec 4, 2025 · Updated 2 months ago
- ☆18 · May 19, 2023 · Updated 2 years ago
- SMT-LIB benchmarks for shape computations from deep learning models in PyTorch ☆18 · Dec 21, 2022 · Updated 3 years ago
- If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions ☆17 · Apr 4, 2024 · Updated last year
- ☆71 · Mar 26, 2025 · Updated 10 months ago
- GCM Physics written in JAX ☆79 · Updated this week
- ☆60 · Updated this week
- An HPL-AI implementation for Fugaku ☆23 · Jun 29, 2021 · Updated 4 years ago
- TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing. ☆106 · Jun 28, 2025 · Updated 7 months ago
- Overcoming the IOTLB Wall for Multi-100-Gbps Linux-based Networking ☆24 · May 16, 2023 · Updated 2 years ago
- This tool serves as a test harness for different optimization techniques to improve stencil computations performance in shared and distri… ☆21 · Nov 9, 2022 · Updated 3 years ago
- ☆21 · Apr 17, 2025 · Updated 9 months ago
- [NeurIPS 2022] Code for paper "Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation" ☆27 · Dec 10, 2023 · Updated 2 years ago
- Official implementation for Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds (NeurIPS, 2021). ☆25 · Sep 4, 2022 · Updated 3 years ago
- β-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Neural Network Verification ☆31 · Nov 9, 2021 · Updated 4 years ago
- Create and deploy virtual-experiments - co-processing computational workflows ☆10 · Jan 28, 2026 · Updated 2 weeks ago
- ☆26 · Dec 3, 2025 · Updated 2 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 · Sep 19, 2025 · Updated 4 months ago
- An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3 ☆29 · May 30, 2021 · Updated 4 years ago
- Learning Security Classifiers with Verified Global Robustness Properties (CCS'21) https://arxiv.org/pdf/2105.11363.pdf ☆28 · Dec 1, 2021 · Updated 4 years ago
- Writing FLUX in Triton ☆41 · Sep 22, 2024 · Updated last year
- ☆54 · May 5, 2025 · Updated 9 months ago
- JAX implementation of the Mistral 7b v0.2 model ☆35 · Jul 3, 2024 · Updated last year
- Hardened Extension of the Adversarial Robustness Toolbox (HEART) supports assessment of adversarial AI vulnerabilities in Test & Evaluati… ☆15 · Sep 18, 2025 · Updated 4 months ago
- Word2vec source code with detailed bilingual annotations (well-annotated word2vec) ☆10 · Oct 3, 2021 · Updated 4 years ago
- ext_mpi_collectives ☆11 · Apr 1, 2025 · Updated 10 months ago
- Port of the LLVM compiler infrastructure to the time-predictable processor Patmos ☆15 · Apr 2, 2025 · Updated 10 months ago
- PARADIS, a lightweight and flexible weather forecast model that tries to Keep It Simple. ☆25 · Feb 4, 2026 · Updated last week
- Memory Topology for GPUs ☆17 · Dec 9, 2025 · Updated 2 months ago
- ☆38 · May 20, 2021 · Updated 4 years ago
- HPCG benchmark based on ROCm platform ☆39 · Feb 3, 2026 · Updated last week
- Extensible collectives library in Triton ☆95 · Mar 31, 2025 · Updated 10 months ago