ReaLLMASIC / nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆31 Updated last week
Alternatives and similar repositories for nanoGPT
Users who are interested in nanoGPT are comparing it to the repositories listed below.
- A Flexible and Energy Efficient Accelerator For Sparse Convolution Neural Network ☆73 Updated 3 months ago
- HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators ☆150 Updated 2 months ago
- A reading list for SRAM-based Compute-In-Memory (CIM) research. ☆66 Updated 4 months ago
- A RISC-V BOOM Microarchitecture Power Modeling Framework ☆24 Updated 2 years ago
- An Open Workflow to Build Custom SoCs and run Deep Models at the Edge ☆79 Updated 2 weeks ago
- ☆42 Updated 8 months ago
- INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed. ☆103 Updated 4 years ago
- CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture ☆143 Updated this week
- A SystemVerilog implementation of Row-Stationary dataflow and Hierarchical Mesh Network-on-Chip Architecture based on Eyeriss CNN Acceler… ☆160 Updated 5 years ago
- A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching ☆53 Updated 2 months ago
- A Fast, Low-Overhead On-chip Network ☆208 Updated last week
- tpu-systolic-array-weight-stationary ☆24 Updated 4 years ago
- ☆111 Updated 4 years ago
- A Verilog module implementing convolution on a TPU-style systolic array ☆117 Updated 3 weeks ago
- ASIC Design Kit for FreePDK45 + Nangate for use with mflowgen ☆177 Updated 5 years ago
- CATCH 1.0, Initial full release of CATCH cost model. ☆14 Updated 3 months ago
- ☆166 Updated 2 months ago
- RTL Network-on-Chip Router Design in SystemVerilog by Andrea Galimberti, Filippo Testa and Alberto Zeni ☆126 Updated 7 years ago
- A Verilog implementation for Network-on-Chip ☆73 Updated 7 years ago
- ☆21 Updated 3 weeks ago
- Multi-core HW accelerator mapping optimization framework for layer-fused ML workloads. ☆54 Updated last month
- A Verilog implementation of a 4x4 systolic array multiplier ☆54 Updated 4 years ago
- 16-bit Adder Multiplier hardware on Digilent Basys 3 ☆75 Updated last year
- eyeriss-chisel3 ☆40 Updated 3 years ago
- An integrated CGRA design framework ☆89 Updated 2 months ago
- CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development. ☆129 Updated last week
- FPGA based Vision Transformer accelerator (Harvard CS205) ☆122 Updated 3 months ago
- An open-source benchmark for generating design RTL with natural language ☆111 Updated 7 months ago
- An open-source parameterizable NPU generator with full-stack multi-target compilation stack for intelligent workloads. ☆53 Updated 2 months ago
- 32-bit floating-point operation algorithms implemented in Verilog using logic operations. ☆85 Updated 6 years ago