Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.
☆18Dec 1, 2023Updated 2 years ago
Alternatives and similar repositories for LLaMA2-7B-on-laptop
Users who are interested in LLaMA2-7B-on-laptop are comparing it to the libraries listed below.
- All Homeworks for TinyML and Efficient Deep Learning Computing 6.5940 • Fall • 2023 • https://efficientml.ai☆197Dec 2, 2023Updated 2 years ago
- Model acceleration / model compression (all labs completed)☆11Dec 24, 2023Updated 2 years ago
- Structured Pruning Adapters in PyTorch☆19Aug 30, 2023Updated 2 years ago
- Code release for AdapMoE accepted by ICCAD 2024☆38Apr 28, 2025Updated last year
- Official implementation of the paper: "A deeper look at depth pruning of LLMs"☆15Jul 24, 2024Updated last year
- ☆180Aug 9, 2023Updated 2 years ago
- ☆10Feb 7, 2022Updated 4 years ago
- In this project, we provide a strong template for a PyTorch project. The purpose of this repository is to provide an example (and strong …☆15Apr 21, 2021Updated 5 years ago
- [ICLR 2023] PyTorch code for DFPC: Data flow driven pruning of coupled channels without data.☆15Aug 25, 2023Updated 2 years ago
- [OSDI 2025] DecDEC: A Systems Approach to Advancing Low‑Bit LLM Quantization☆24Jan 29, 2026Updated 3 months ago
- ☆78Nov 5, 2024Updated last year
- An implementation of Distortion-Free Wide-Angle Portraits on Camera Phones☆10Dec 24, 2019Updated 6 years ago
- Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution"☆12Apr 14, 2023Updated 3 years ago
- [NeurIPS 2024] Search for Efficient LLMs☆17Jan 16, 2025Updated last year
- [arXiv 2024] Is Oracle Pruning the True Oracle?☆26Jan 10, 2025Updated last year
- A Valgrind extension for CUDA, unofficial mirror for https://www.hlrs.de/organization/av/spmt/research/cudagrind/☆10Aug 5, 2015Updated 10 years ago
- ☆12Sep 18, 2024Updated last year
- This is a repository of coursework project for the Stanford Compilers MOOC course. The result is a fully-working compiler for the COOL Pr…☆18Sep 11, 2023Updated 2 years ago
- A cpp threadpool for c++11 c++14 c++17 c++20☆15Jun 30, 2023Updated 2 years ago
- Minimal implementation of Denoised Smoothing (https://arxiv.org/abs/2003.01908) in TensorFlow.☆20Aug 4, 2021Updated 4 years ago
- ☆12Jun 12, 2025Updated 10 months ago
- ☆10Oct 8, 2021Updated 4 years ago
- TinyML and Efficient Deep Learning Computing☆20Apr 26, 2024Updated 2 years ago
- code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"☆29Oct 31, 2020Updated 5 years ago
- Implementation of the paper : Not all attention is needed - Gated Attention Network for Sequence Data (GA-Net) [https://arxiv.org/abs/191…☆13Aug 20, 2020Updated 5 years ago
- 🦙🦙.🦀☆28Sep 24, 2023Updated 2 years ago
- A docker image for One Student One Chip's debug exam☆10Sep 22, 2023Updated 2 years ago
- A fast implementation of Leiserchess AI for MIT 6.172 '16 http://scrimmage.csail.mit.edu/☆12Dec 22, 2016Updated 9 years ago
- This repo contains the Assignments from Cornell Tech's ECE 5545 - Machine Learning Hardware and Systems offered in Spring 2023☆45May 31, 2023Updated 2 years ago
- ☆10Feb 17, 2022Updated 4 years ago
- Computer Vision Fall 2019 by Chiu-San Fu@ CSIE NTU Taiwan☆10Jan 11, 2020Updated 6 years ago
- 🔈 Sonos controller library written in Rust☆28Sep 23, 2021Updated 4 years ago
- Sirius, an efficient correction mechanism, which significantly boosts Contextual Sparsity models on reasoning tasks while maintaining its…☆21Sep 10, 2024Updated last year
- Harbin Institute of Technology (Shenzhen), Fall 2021 semester deep learning architecture labs☆17Oct 1, 2022Updated 3 years ago
- ☆10Nov 14, 2023Updated 2 years ago
- Asynchronous Rust bindings for SPDK.☆18Nov 1, 2022Updated 3 years ago
- TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition☆30Feb 5, 2026Updated 3 months ago
- Fall 2023 PKU Compiler Principles lab, plus documentation for the Koopa IR C++ interface☆16Feb 12, 2024Updated 2 years ago
- Official Pytorch Implementation of Paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models"☆20Feb 21, 2025Updated last year