A curated list for Efficient Large Language Models
☆11Mar 25, 2024Updated 2 years ago
Alternatives and similar repositories for Awesome-Efficient-LLM
Users that are interested in Awesome-Efficient-LLM are comparing it to the libraries listed below.
- ☆11Mar 29, 2020Updated 6 years ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents☆25May 17, 2026Updated last week
- [NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference☆18Nov 6, 2024Updated last year
- ☆19Mar 4, 2025Updated last year
- Estimating hardware and cloud costs of LLMs and transformer projects☆22Apr 1, 2026Updated last month
- ☆15Jun 26, 2024Updated last year
- A reading group for system verification papers☆10Sep 28, 2023Updated 2 years ago
- raytracer☆10Jul 18, 2022Updated 3 years ago
- Userspace eBPF Runtime Benchmarking Test Suite and Results☆16May 21, 2026Updated last week
- A Telegram bot that sends you a message when the GPU is in use☆11May 27, 2024Updated 2 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆24May 20, 2026Updated last week
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- MATLAB/Octave generator of Hamming ECC coding. Output format is Verilog HDL.☆12Dec 27, 2022Updated 3 years ago
- A unified programming framework for high and portable performance across FPGAs and GPUs☆11Mar 23, 2025Updated last year
- An MLIR-based AI compiler designed for Python frontend to RISC-V DSA☆14Oct 10, 2024Updated last year
- EdgeRag is a program that runs large language models and vector databases on your local device☆14May 29, 2024Updated 2 years ago
- Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.☆13Nov 3, 2023Updated 2 years ago
- First Latency-Aware Competitive LLM Agent Benchmark☆28Jun 3, 2025Updated 11 months ago
- File System in User Space☆13Oct 31, 2019Updated 6 years ago
- Paper-reading notes for Berkeley OS prelim exam.☆14Aug 28, 2024Updated last year
- FPGA acceleration of arbitrary precision floating point computations.☆41May 17, 2022Updated 4 years ago
- 🌏 Teddy is a tiny but scalable http server based on Java NIO, inspired by netty.☆11Dec 26, 2019Updated 6 years ago
- ASKAP Benchmark Packages☆13Nov 3, 2023Updated 2 years ago
- The source code for GPGPUSim+Ramulator simulator. In this version, GPGPUSim uses Ramulator to simulate the DRAM. This simulator is used t…☆60Sep 30, 2019Updated 6 years ago
- An FPGA-based neural network inference accelerator, which won third place in DAC-SDC☆28May 11, 2022Updated 4 years ago
- GEMM and Winograd based convolutions using CUTLASS☆28Jul 15, 2020Updated 5 years ago
- CLI that uses DSPy to interact with MCP servers.☆24Mar 10, 2025Updated last year
- ☆14Mar 1, 2021Updated 5 years ago
- Text Classification Dataset for Turkish Language☆10Nov 16, 2021Updated 4 years ago
- OS for fun☆11May 29, 2021Updated 5 years ago
- [ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan …☆35Jul 12, 2022Updated 3 years ago
- 2-8bit weights, 8-bit activations flexible Neural Processing Engine for PULP clusters☆32May 20, 2026Updated last week
- A Kubernetes operator for running benchmark tests on databases to evaluate their performance.☆16Updated this week
- Book manuscript for "AI Regulation, Ethics, and Social Impact"☆14Aug 28, 2021Updated 4 years ago
- STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth☆18Aug 21, 2023Updated 2 years ago
- Python package for calculation and simulation of n-bodies interaction.☆30Apr 20, 2023Updated 3 years ago
- Topling core libraries in an ark☆18May 17, 2026Updated last week
- ☆19Jun 3, 2025Updated 11 months ago
- Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding"☆10Jul 8, 2020Updated 5 years ago