dthuerck / culip
Code for the culip ("CUda for Linear and Integer Programming") project, containing GPU primitives for linear algebra, linear optimization and (someday) integer optimization.
☆19 Updated 6 years ago
Alternatives and similar repositories for culip
Users who are interested in culip are comparing it to the repositories listed below.
- Some microbenchmarks and design docs before commencement ☆12 Updated 4 years ago
- Loop Nest - Linear algebra compiler and code generator. ☆22 Updated 2 years ago
- A lightweight, user-friendly data-plane for LLM training. ☆16 Updated last month
- benchmarking some transformer deployments ☆26 Updated 2 years ago
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes" ☆19 Updated last year
- Solver for Unconstrained Binary Quadratic Optimization (UBQO, BQO, QUBO) and Max 2-SAT, based on semidefinite relaxation with constraint … ☆15 Updated 2 years ago
- FlexAttention w/ FlashAttention3 Support ☆26 Updated 8 months ago
- Event-Triggered Communication in Parallel Machine Learning ☆28 Updated 3 years ago
- ☆11 Updated 3 years ago
- "Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation ☆29 Updated 4 months ago
- A collection of reproducible inference engine benchmarks ☆31 Updated last month
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Updated 3 years ago
- code for paper "Accessing higher dimensions for unsupervised word translation" ☆21 Updated last year
- Code for our ICLR Trustworthy ML 2020 workshop paper "Improved Image Wasserstein Attacks and Defenses" ☆14 Updated 5 years ago
- ☆28 Updated 4 months ago
- Benchmarks to capture important workloads. ☆31 Updated 4 months ago
- Minimal C++ implementation of GPT2 ☆40 Updated last year
- Compression for Foundation Models ☆31 Updated 2 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆19 Updated 2 years ago
- Development repository for integrating FlexFlow (A distributed deep learning framework that supports flexible parallelization strategies)… ☆29 Updated 3 years ago
- A Learnable LSH Framework for Efficient NN Training ☆31 Updated 3 years ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆15 Updated last year
- No-GIL Python environment featuring NVIDIA Deep Learning libraries. ☆60 Updated last month
- Training hybrid models for dummies. ☆21 Updated 4 months ago
- Input (scripts, etc.) and output (scripts, performance results, etc.) for Gunrock and other graph engines ☆10 Updated last year
- A tracing JIT compiler for PyTorch ☆13 Updated 3 years ago
- ☆26 Updated 2 years ago
- Official Implementation of "CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks" ☆19 Updated this week
- Code for Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture (accepted by PVLDB). The outdated wr… ☆9 Updated last year
- Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers ☆142 Updated 5 months ago