rejunity / tiny-asic-1_58bit-matrix-mul
Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits"
☆158Updated last year
Alternatives and similar repositories for tiny-asic-1_58bit-matrix-mul
Users who are interested in tiny-asic-1_58bit-matrix-mul are comparing it to the repositories listed below
- ☆97Updated last year
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆109Updated 9 months ago
- Machine-Learning Accelerator System Exploration Tools☆171Updated last month
- The Riallto Open Source Project from AMD☆81Updated 3 months ago
- An AI accelerator implementation with Xilinx FPGA☆48Updated 5 months ago
- A new LLM solution for RTL code generation, achieving state-of-the-art performance in non-commercial solutions and outperforming GPT-3.5.☆210Updated 5 months ago
- Verilog evaluation benchmark for large language model☆288Updated 5 months ago
- A survey on Hardware Accelerated LLMs☆56Updated 6 months ago
- First Open-Source Industry-Specific Model for Semiconductors☆348Updated 2 months ago
- ☆139Updated 3 weeks ago
- PB-LLM: Partially Binarized Large Language Models☆152Updated last year
- Open source machine learning accelerators☆382Updated last year
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆153Updated 9 months ago
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs☆222Updated 6 months ago
- A high-efficiency system-on-chip for floating-point compute workloads.☆38Updated 6 months ago
- Run 64-bit Linux on LiteX + RocketChip☆200Updated 11 months ago
- Open-source software/hardware platform to build edge AI solutions deployed on FPGA or custom ASIC hardware.☆256Updated 3 months ago
- This project aims to enable language model inference on FPGAs, supporting AI applications in edge devices and environments with limited r…☆163Updated last year
- ☆24Updated last year
- Torch2Chip (MLSys, 2024)☆53Updated 3 months ago
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1.☆164Updated 11 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆277Updated last year
- Verilog package manager written in Rust☆143Updated 9 months ago
- Fully open-source spiking neural network accelerator☆152Updated 2 years ago
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models☆277Updated last month
- Experimental BitNet Implementation☆68Updated 3 weeks ago
- Ocelot: The Berkeley Out-of-Order Machine With V-EXT support☆170Updated 6 months ago
- 1.58-bit LLaMa model☆81Updated last year
- Research and Materials on Hardware implementation of Transformer Model☆268Updated 4 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆108Updated 2 months ago