Adam-Mazur/Lazy-Llama

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Adam-Mazur/Lazy-Llama)

Adam-Mazur / Lazy-Llama

An implementation of LazyLLM token pruning for LLaMa 2 model family.

☆13

Alternatives and similar repositories for Lazy-Llama

Users that are interested in Lazy-Llama are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

microsoft / autodistil
View on GitHub
☆12Oct 17, 2023Updated 2 years ago
shaoyiHusky / SparseProgressiveDistillation
View on GitHub
☆12Aug 22, 2023Updated 2 years ago
EIT-NLP / SkipGPT
View on GitHub
[ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". …
☆21Nov 17, 2025Updated 8 months ago
tysonpond / symbolic-transfer-entropy
View on GitHub
My implementation of Symbolic Transfer Entropy (STE): a measure of asymmetric information flow between stochastic processes.
☆10Jul 9, 2019Updated 7 years ago
Stereotypes-in-LLMs / recruitment-dataset
View on GitHub
Repositories related to preprocessing and matching job profiles and anonymous candidates profiles based on data from Djinni.
☆14Dec 6, 2025Updated 7 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
nikita9604 / Scientific-Calculator-with-Keypad
View on GitHub
Scientific Calculator using Arduino Uno (Keypad & LCD)
☆18Nov 21, 2020Updated 5 years ago
Tangshengku / Bi-Mamba
View on GitHub
The official implementation of Bi-Mamba
☆17Oct 22, 2025Updated 9 months ago
Compositionality / compositionality-latex-template
View on GitHub
The Compositionality article class.
☆14Mar 16, 2026Updated 4 months ago
Tangshengku / DDR-Net
View on GitHub
The official implementation of "DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range"
☆24Aug 25, 2021Updated 4 years ago
zhenyuhe00 / BiPE
View on GitHub
Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024
☆24Jun 26, 2024Updated 2 years ago
ucu-computer-science / Kyiv_emulator
View on GitHub
Emulator of one of the oldest machines in Europe - Kyiv.
☆21Apr 28, 2022Updated 4 years ago
3uchen / IdaLy
View on GitHub
A library and a software that support various algorithms for industrial data augmentation.
☆18Dec 16, 2024Updated last year
millerlp / Solarlib
View on GitHub
Arduino solar position calculator
☆20Apr 19, 2021Updated 5 years ago
Tangshengku / PCA-SVD-Autoencoder-Fourier-Wavelet-Transformation-for-denoising
View on GitHub
PCA-SVD-Autoencoder-Fourier-Wavelet-Transformation-for-denoising
☆22Feb 16, 2022Updated 4 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
paolobrasolin / string-diagrams
View on GitHub
Create string diagrams with LaTeX!
☆14Jan 3, 2025Updated last year
IST-DASLab / DarwinLM
View on GitHub
Official Pytorch Implementation of Paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models"
☆20Feb 21, 2025Updated last year
samchaineau / llm_slerp_generation
View on GitHub
Repo hosting codes and materials related to speeding LLMs' inference using token merging.
☆37Oct 9, 2025Updated 9 months ago
zinengtang / Perceiver_VL
View on GitHub
PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023)
☆34Feb 5, 2023Updated 3 years ago
ckkissane / sae-transfer
View on GitHub
Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"
☆13Jul 18, 2024Updated 2 years ago
rainstorm12 / KG-RAG
View on GitHub
简单实现了一下基于知识图谱和文本文档联合做检索增强(RAG)大模型的实现，这里采用的数据分别是管廊维护领域的文本文档和专家知识图谱
☆24Jun 6, 2024Updated 2 years ago
MacPaw / Screen2AX
View on GitHub
Repository for Screen2AX paper
☆32Jun 3, 2026Updated last month
MacPaw / Gliner2Swift
View on GitHub
Swift port of Gliner2 framework
☆36May 21, 2026Updated 2 months ago
alex-damian / EOS
View on GitHub
☆15Sep 29, 2022Updated 3 years ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
jsrozner / decrypt
View on GitHub
Repository for paper Decrypting Cryptic Crosswords
☆11Jan 15, 2022Updated 4 years ago
recursal / GoldFinch-paper
View on GitHub
GoldFinch and other hybrid transformer components
☆46Jul 20, 2024Updated 2 years ago
baaesh / BiLSTM-Generalized-Pooling-pytorch
View on GitHub
Enhancing Sentence Embedding with Generalized Pooling
☆11Jul 26, 2018Updated 8 years ago
TunnelTechnologies / dmrg-exact
View on GitHub
☆14Feb 25, 2019Updated 7 years ago
probcomp / genlm-control
View on GitHub
☆13Apr 17, 2025Updated last year
Bond1995 / Markov
View on GitHub
Code for experiments on transformers using Markovian data.
☆22Nov 22, 2024Updated last year
danieldeutsch / summarize
View on GitHub
☆12Nov 11, 2019Updated 6 years ago
liangyuwang / Tiny-Megatron
View on GitHub
Tiny-Megatron, a minimalistic re-implementation of the Megatron library
☆32Sep 1, 2025Updated 10 months ago
MacPaw / ai-sdk-typescript
View on GitHub
Official TypeScript SDK for AI Gateway - universal client for browser and Node.js (Chat, Embeddings, Images, Audio, NestJS module)
☆58Updated this week
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
lmsdss / LayerNorm-Scaling
View on GitHub
[NeurIPS 2025] Official Pytorch Implementation of "The Curse of Depth in Large Language Models" by Wenfang Sun, Xinyuan Song, Pengxiang L…
☆72Mar 3, 2026Updated 4 months ago
devanshj / prakaar
View on GitHub
A type programming language which compiles to and interops with type-level TypeScript
☆22Sep 9, 2022Updated 3 years ago
ilur98 / DGQ
View on GitHub
Official Code For Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM
☆14Dec 27, 2023Updated 2 years ago
NolanoOrg / sparse_quant_llms
View on GitHub
SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia
☆42Mar 13, 2023Updated 3 years ago
MarkGHX / BiScope
View on GitHub
Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
☆29Feb 17, 2026Updated 5 months ago
milon27 / next-js-portfolio
View on GitHub
milon27.com portfolio application with NEXT JS and contentful
☆25Feb 25, 2022Updated 4 years ago
borjanG / 2023-transformers
View on GitHub
Codes for the paper The emergence of clusters in self-attention dynamics.
☆17Dec 18, 2023Updated 2 years ago