rejunity / tiny-asic-1_58bit-matrix-mul
Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits"
☆159 Updated last year
Alternatives and similar repositories for tiny-asic-1_58bit-matrix-mul
Users who are interested in tiny-asic-1_58bit-matrix-mul are comparing it to the repositories listed below
- ☆99 Updated last year
- Machine-Learning Accelerator System Exploration Tools ☆173 Updated 2 months ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆110 Updated 10 months ago
- An AI accelerator implementation with Xilinx FPGA ☆51 Updated 7 months ago
- The Riallto Open Source Project from AMD ☆82 Updated 4 months ago
- A high-efficiency system-on-chip for floating-point compute workloads. ☆40 Updated 7 months ago
- A survey on Hardware Accelerated LLMs ☆59 Updated 7 months ago
- Minimal C implementation of speculative decoding based on llama2.c ☆24 Updated last year
- Run 64-bit Linux on LiteX + RocketChip ☆201 Updated last month
- Fully open-source spiking neural network accelerator ☆154 Updated 2 years ago
- A new LLM solution for RTL code generation, achieving state-of-the-art performance in non-commercial solutions and outperforming GPT-3.5. ☆220 Updated 6 months ago
- ☆149 Updated 2 months ago
- Universal Memory Interface (UMI) ☆148 Updated last week
- Torch2Chip (MLSys, 2024) ☆53 Updated 4 months ago
- Ocelot: The Berkeley Out-of-Order Machine With V-EXT support ☆174 Updated last week
- This project aims to enable language model inference on FPGAs, supporting AI applications in edge devices and environments with limited r… ☆165 Updated last year
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs ☆224 Updated 7 months ago
- DNN Compiler for Heterogeneous SoCs ☆44 Updated this week
- Samples of good AI generated CUDA kernels ☆89 Updated 3 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 Updated 10 months ago
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1. ☆176 Updated last year
- BitNet a4.8 implementation in a single PyTorch file ☆16 Updated 7 months ago
- Research and Materials on Hardware implementation of Transformer Model ☆279 Updated 6 months ago
- Experimental BitNet Implementation ☆69 Updated 2 months ago
- PB-LLM: Partially Binarized Large Language Models ☆153 Updated last year
- QuIP quantization ☆57 Updated last year
- RWKV v7 inference in pure C. ☆38 Updated last week
- LLM Agent for Hardware Description Language ☆19 Updated 2 months ago
- NeuraLUT-Assemble ☆38 Updated last week
- ☆27 Updated last year