GreenWaves-Technologies / bfloat16
bfloat16 dtype for numpy
☆20Updated 2 years ago
Alternatives and similar repositories for bfloat16
Users that are interested in bfloat16 are comparing it to the libraries listed below
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX☆164Updated last week
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware.☆111Updated 11 months ago
- Sandbox for TVM and playing around!☆22Updated 2 years ago
- A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission.☆14Updated 2 years ago
- ☆163Updated 2 years ago
- Fork of upstream onnxruntime focused on supporting risc-v accelerators☆87Updated 2 years ago
- ☆158Updated 2 years ago
- This project contains a code generator that produces static C NN inference deployment code targeting tiny micro-controllers (TinyML) as r…☆29Updated 4 years ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores☆41Updated last year
- Customized matrix multiplication kernels☆57Updated 3 years ago
- A Deep Learning Framework for the Posit Number System☆30Updated last year
- GPTQ inference TVM kernel☆39Updated last year
- ☆68Updated 2 years ago
- An experimental CPU backend for Triton (https://github.com/openai/triton)☆47Updated 2 months ago
- ☆71Updated 7 months ago
- Converting a deep neural network to integer-only inference in native C via uniform quantization and the fixed-point representation.☆26Updated 3 years ago
- A stand-alone implementation of several NumPy dtype extensions used in machine learning.☆305Updated last week
- Framework to reduce autotune overhead to zero for well known deployments.☆85Updated last month
- A lightweight, Pythonic, frontend for MLIR☆80Updated 2 years ago
- The Riallto Open Source Project from AMD☆84Updated 6 months ago
- benchmarking some transformer deployments☆26Updated 2 years ago
- [TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA☆17Updated 3 years ago
- ☆11Updated 4 years ago
- Curated content for DNN approximation, acceleration ... with a focus on hardware accelerator and deployment☆27Updated last year
- Parse TFLite models (*.tflite) EASILY with Python. Check the API at https://zhenhuaw.me/tflite/docs/☆102Updated 9 months ago
- An AI accelerator implementation with Xilinx FPGA☆68Updated 9 months ago
- ☆39Updated last year
- Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository …☆106Updated 10 months ago
- Trying to find the minimal model that can achieve 99% accuracy on the MNIST dataset☆27Updated 7 years ago
- ☆50Updated last year