rejunity / tiny-asic-1_58bit-matrix-mulLinks

Tiny ASIC implementation for "The Era of 1-bit LLMs All Large Language Models are in 1.58 Bits" matrix multiplication unit

☆158

Alternatives and similar repositories for tiny-asic-1_58bit-matrix-mul

Users that are interested in tiny-asic-1_58bit-matrix-mul are comparing it to the libraries listed below

Sorting:

HLSTransform / submission
☆97Updated last year
GATECH-EIC / ShiftAddLLM
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
☆109Updated 9 months ago
DeepWok / mase
Machine-Learning Accelerator System Exploration Tools
☆171Updated last month
AMDResearch / Riallto
The Riallto Open Source Project from AMD
☆81Updated 3 months ago
KohakuBlueleaf / HakuTPU
An AI accelerator implementation with Xilinx FPGA
☆48Updated 5 months ago
hkust-zhiyao / RTL-Coder
A new LLM solution for RTL code generation, achieving state-of-the-art performance in non-commercial solutions and outperforming GPT-3.5.
☆210Updated 5 months ago
NVlabs / verilog-eval
Verilog evaluation benchmark for large language model
☆288Updated 5 months ago
kachris / survey_HA_LLM
A survey on Hardware Accelerated LLMs
☆56Updated 6 months ago
aitomatic / semikong
First Open-Source Industry-Specific Model for Semiconductors
☆348Updated 2 months ago
Cornell-RelaxML / qtip
☆139Updated 3 weeks ago
hahnyuan / PB-LLM
PB-LLM: Partially Binarized Large Language Models
☆152Updated last year
tensil-ai / tensil
Open source machine learning accelerators
☆382Updated last year
astramind-ai / BitMat
An efficent implementation of the method proposed in "The Era of 1-bit LLMs"
☆153Updated 9 months ago
Aaronhuang-778 / BiLLM
[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
☆222Updated 6 months ago
pulp-platform / occamy
A high-efficiency system-on-chip for floating-point compute workloads.
☆38Updated 6 months ago
litex-hub / linux-on-litex-rocket
Run 64-bit Linux on LiteX + RocketChip
☆200Updated 11 months ago
ztachip / ztachip
Opensource software/hardware platform to build edge AI solutions deployed on FPGA or custom ASIC hardware.
☆256Updated 3 months ago
turingmotors / swan
This project aims to enable language model inference on FPGAs, supporting AI applications in edge devices and environments with limited r…
☆163Updated last year
FlightLLM / flightllm_test_demo
☆24Updated last year
SeoLabCornell / torch2chip
Torch2Chip (MLSys, 2024)
☆53Updated 3 months ago
eevaain / tiny-tpu
A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1.
☆164Updated 11 months ago
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆277Updated last year
getinstachip / vpm
Verilog package manager written in Rust
☆143Updated 9 months ago
sfmth / OpenSpike
Fully opensource spiking neural network accelerator
☆152Updated 2 years ago
OpenGVLab / EfficientQAT
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
☆277Updated last month
kotak-ai / 1.58BitNet
Experimental BitNet Implementation
☆68Updated 3 weeks ago
tenstorrent / riscv-ocelot
Ocelot: The Berkeley Out-of-Order Machine With V-EXT support
☆170Updated 6 months ago
rafacelente / bllama
1.58-bit LLaMa model
☆81Updated last year
aliemo / transfomers-silicon-research
Research and Materials on Hardware implementation of Transformer Model
☆268Updated 4 months ago
groq / groqflow
GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…
☆108Updated 2 months ago