cchan / fp8_mul
A tiny FP8 multiplication unit written in Verilog. TinyTapeout 2 submission.
☆14 · Updated 2 years ago
Alternatives and similar repositories for fp8_mul
Users interested in fp8_mul are comparing it to the repositories listed below.
- A lightweight core for the CV32E40 implementing the RISC-V vector extension specification. (v0.8) ☆35 · Updated 4 years ago
- Wrappers for open source FPU hardware implementations. ☆34 · Updated last year
- Heterogeneous Cluster Interconnect to bind special-purpose HW accelerators with general-purpose cluster cores ☆14 · Updated this week
- ☆32 · Updated this week
- A stream to RTL compiler based on MLIR and CIRCT ☆15 · Updated 2 years ago
- Synthesisable SIMT-style RISC-V GPGPU ☆41 · Updated 3 months ago
- General Purpose Graphics Processing Unit (GPGPU) IP Core ☆11 · Updated 11 years ago
- The Next-gen Language & Compiler Powering Efficient Hardware Design ☆30 · Updated 9 months ago
- TensorCore Vector Processor for Deep Learning - Google Summer of Code Project ☆22 · Updated 4 years ago
- FPGA acceleration of arbitrary precision floating point computations. ☆40 · Updated 3 years ago
- Pulp virtual platform ☆24 · Updated 3 months ago
- Various examples for Chisel HDL ☆29 · Updated 3 years ago
- Custom extensions to the RISC-V isa simulator for the UCB-BAR ESP project ☆17 · Updated 2 years ago
- ☆79 · Updated last week
- Learn NVDLA by SOMNIA ☆43 · Updated 5 years ago
- PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications ☆43 · Updated 2 years ago
- 2-8bit weights, 8-bit activations flexible Neural Processing Engine for PULP clusters ☆27 · Updated this week
- The official NaplesPU hardware code repository ☆19 · Updated 6 years ago
- An LLVM pass to prove that an II works for the given loop for Vitis HLS ☆11 · Updated 4 years ago
- Virtualized Accelerator Orchestration for Multi-Tenant Workloads ☆19 · Updated 11 months ago
- Meta-Repository for Bespoke Silicon Group's Manycore Architecture (A.K.A HammerBlade) ☆43 · Updated 4 months ago
- FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters ☆17 · Updated 4 years ago
- ☆36 · Updated 4 years ago
- Example for running IREE in a bare-metal Arm environment. ☆39 · Updated 2 months ago
- Lake is a framework for generating synthesizable memory modules from a high-level behavioral specification and widely-available memory ma… ☆22 · Updated this week
- xkDLA: XinKai Deep Learning Accelerator (RTL) ☆39 · Updated last year
- A high-efficiency system-on-chip for floating-point compute workloads. ☆43 · Updated 9 months ago
- HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration ☆15 · Updated 5 years ago
- Wraps the NVDLA project for Chipyard integration ☆21 · Updated last month
- Proposed RISC-V Composable Custom Extensions Specification ☆70 · Updated 3 months ago