A curated list for Efficient Large Language Models
☆11Mar 25, 2024Updated 2 years ago
Alternatives and similar repositories for Awesome-Efficient-LLM
Users that are interested in Awesome-Efficient-LLM are comparing it to the libraries listed below.
- ☆11Mar 29, 2020Updated 6 years ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents☆23Jan 6, 2026Updated 2 months ago
- [NeurIPS 2024] BLAST: Block Level Adaptive Structured Matrix for Efficient Deep Neural Network Inference☆17Nov 6, 2024Updated last year
- ☆18Mar 4, 2025Updated last year
- Estimating hardware and cloud costs of LLMs and transformer projects☆21Jan 15, 2026Updated 2 months ago
- ☆15Jun 26, 2024Updated last year
- A reading group for system verification papers☆10Sep 28, 2023Updated 2 years ago
- raytracer☆10Jul 18, 2022Updated 3 years ago
- Userspace eBPF Runtime Benchmarking Test Suite and Results☆16Updated this week
- A telegram bot that sends you a message when the GPU is in use☆10May 27, 2024Updated last year
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆22Mar 18, 2026Updated last week
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- MATLAB/Octave generator of Hamming ECC coding. Output format is Verilog HDL.☆12Dec 27, 2022Updated 3 years ago
- A unified programming framework for high and portable performance across FPGAs and GPUs☆11Mar 23, 2025Updated last year
- An MLIR-based AI compiler designed for Python frontend to RISC-V DSA☆13Oct 10, 2024Updated last year
- EdgeRag is a program that runs large language models and vector databases on your local device☆14May 29, 2024Updated last year
- Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.☆13Nov 3, 2023Updated 2 years ago
- First Latency-Aware Competitive LLM Agent Benchmark☆26Jun 3, 2025Updated 9 months ago
- File System in User Space☆13Oct 31, 2019Updated 6 years ago
- Paper-reading notes for Berkeley OS prelim exam.☆14Aug 28, 2024Updated last year
- FPGA acceleration of arbitrary precision floating point computations.☆40May 17, 2022Updated 3 years ago
- 🌏 Teddy is a tiny but scalable http server based on Java NIO, inspired by netty.☆11Dec 26, 2019Updated 6 years ago
- ASKAP Benchmark Packages☆13Nov 3, 2023Updated 2 years ago
- The source code for GPGPUSim+Ramulator simulator. In this version, GPGPUSim uses Ramulator to simulate the DRAM. This simulator is used t…☆60Sep 30, 2019Updated 6 years ago
- An FPGA-based neural network inference accelerator that won third place in DAC-SDC☆28May 11, 2022Updated 3 years ago
- GEMM and Winograd based convolutions using CUTLASS☆28Jul 15, 2020Updated 5 years ago
- Manuscript of the book "Artificial Intelligence Regulations, Ethics, and Social Impact"☆13Aug 28, 2021Updated 4 years ago
- Repo for collaboration on OSS agentic code search☆44Updated this week
- CLI that uses DSPy to interact with MCP servers.☆24Mar 10, 2025Updated last year
- 2-8bit weights, 8-bit activations flexible Neural Processing Engine for PULP clusters☆31Jan 29, 2026Updated 2 months ago
- ☆14Mar 1, 2021Updated 5 years ago
- Text Classification Dataset for Turkish Language☆10Nov 16, 2021Updated 4 years ago
- OS for fun☆11May 29, 2021Updated 4 years ago
- [ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan …☆35Jul 12, 2022Updated 3 years ago
- A Kubernetes operator for running benchmark tests on databases to evaluate their performance.☆16Sep 28, 2025Updated 6 months ago
- STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth☆17Aug 21, 2023Updated 2 years ago
- Python package for calculation and simulation of n-bodies interaction.☆30Apr 20, 2023Updated 2 years ago
- ☆12Feb 16, 2023Updated 3 years ago
- Topling core libraries in an ark☆17Updated this week