vtjnash / atlas-3.10.0

https://github.com/math-atlas/math-atlas -- ATLAS 3.10.0, edited from the source tarball to build with mingw32 (and presumably still on most other platforms, except probably a Cygwin cross-compile)

☆10

Alternatives and similar repositories for atlas-3.10.0:

Users who are interested in atlas-3.10.0 are comparing it to the libraries listed below.

ColfaxResearch / FALCON
Library for fast image convolution in neural networks on Intel Architecture
☆29 · Updated 7 years ago
flame / fmm-gen
Generating Families of Practical Fast Matrix Multiplication Algorithms
☆12 · Updated 7 years ago
asprasad / treebeard
An optimizing compiler for decision tree ensemble inference.
☆17 · Updated last month
MatthieuCourbariaux / deep-learning-multipliers
Training deep neural networks with low precision multiplications
☆63 · Updated 9 years ago
dmlc / nnvm-fusion
Kernel Fusion and Runtime Compilation Based on NNVM
☆70 · Updated 8 years ago
ppwwyyxx / haDNN
Proof-of-Concept CNN in Halide
☆22 · Updated 8 years ago
arbenson / fast-matmul
Fast matrix multiplication
☆29 · Updated 3 years ago
HPAC / TTC
TTC: A high-performance Compiler for Tensor Transpositions
☆20 · Updated 7 years ago
clementfarabet / neuflow
Compiler toolkit for neuFlow.
☆26 · Updated 11 years ago
ravi-teja-mullapudi / Halide-NN
CNNs in Halide
☆23 · Updated 9 years ago
polymage-labs / mlirx
MLIRX is now defunct. Please see PolyBlocks - https://docs.polymagelabs.com
☆38 · Updated last year
onnx / onnx-xla
XLA integration of Open Neural Network Exchange (ONNX)
☆19 · Updated 6 years ago
andersy005 / tvm-in-action
TVM stack: exploring the incredible explosion of deep-learning frameworks and how to bring them together
☆64 · Updated 6 years ago
uwsampl / relay-aot
An experimental ahead of time compiler for Relay.
☆50 · Updated 5 years ago
parsa-epfl / HBFPEmulator
ColTraIn HBFP Training Emulator
☆16 · Updated 2 years ago
lixiuhong / batched_gemm
☆38 · Updated 5 years ago
google / mlir-npcomp
npcomp - An aspirational MLIR based numpy compiler
☆51 · Updated 4 years ago
intel / mklnn
☆10 · Updated 2 years ago
rdadolf / fathom
Reference workloads for modern deep learning methods.
☆73 · Updated 2 years ago
adityaiitb / PyProf
A GPU performance profiling tool for PyTorch models
☆22 · Updated 2 years ago
maltanar / gemmbitserial
Fast matrix multiplication for few-bit integer matrices on CPUs.
☆27 · Updated 6 years ago
surban / TensorAlgDiff
Automatic Differentiation for Tensor Algebras
☆28 · Updated 6 years ago
hipacc / hipacc
A domain-specific language and compiler for image processing
☆76 · Updated 4 years ago
bondhugula / polymage-benchmarks
Base code and optimized code for the benchmarks used in the PolyMage paper published at ASPLOS 2015
☆19 · Updated 8 years ago
comaniac / epoi
Benchmark PyTorch Custom Operators
☆14 · Updated last year
OpenHero / im2col
image to column
☆30 · Updated 10 years ago
wahibium / KFF
Scalable GPU Kernel Fission/Fusion Transformation for Memory-Bound Kernels
☆13 · Updated 9 years ago
lightsighter / CudaDMA
Emulating DMA Engines on GPUs for Performance and Portability
☆39 · Updated 9 years ago
pigirons / conv3x3_m1
A demo of how to write a high-performance convolution that runs on Apple silicon
☆54 · Updated 3 years ago
xmos / ai_tools
AI applications and tools
☆26 · Updated last week