MrYxJ / calculate-flops.pytorch
calflops is designed to calculate FLOPs, MACs, and parameters of various neural networks, such as Linear, CNN, RNN, GCN, and Transformer architectures (BERT, LLaMA, and other large language models).
☆573 · Updated 4 months ago
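For a quick sense of what the tool reports, the snippet below profiles a torchvision model with calflops. It is a minimal sketch based on the `calculate_flops` entry point shown in the repo's README; the exact keyword arguments are assumptions rather than a guaranteed API.

```python
# Minimal sketch: profiling AlexNet with calflops.
# Assumes the calculate_flops(model=..., input_shape=...) entry point
# documented in the calflops README; keyword arguments are an assumption.
from calflops import calculate_flops
from torchvision import models

model = models.alexnet()
input_shape = (1, 3, 224, 224)  # (batch, channels, height, width)

flops, macs, params = calculate_flops(
    model=model,
    input_shape=input_shape,
    output_as_string=True,  # human-readable values such as "1.43 GFLOPS"
    output_precision=4,
)
print(f"AlexNet -- FLOPs: {flops}, MACs: {macs}, Params: {params}")
```

As a rule of thumb, one MAC (multiply-accumulate) counts as two FLOPs, so the reported FLOPs figure is roughly twice the MACs figure for layers dominated by matrix multiplications.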
Related projects
Alternatives and complementary repositories for calculate-flops.pytorch
- A curated reading list of research in Mixture-of-Experts (MoE). ☆538 · Updated 3 weeks ago
- Awesome list for LLM pruning. ☆167 · Updated this week
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,025 · Updated last week
- A collection of AWESOME things about mixture-of-experts ☆974 · Updated 3 months ago
- Tutel MoE: An Optimized Mixture-of-Experts Implementation ☆735 · Updated this week
- [NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baich… ☆874 · Updated last month
- Lossless Training Speed Up by Unbiased Dynamic Data Pruning ☆318 · Updated last month
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆473 · Updated this week
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights. ☆38 · Updated last week
- Fast inference from large language models via speculative decoding ☆573 · Updated 3 months ago
- This is a collection of our zero-cost NAS and efficient vision applications. ☆379 · Updated last year
- We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard laten… ☆835 · Updated this week
- A curated list for Efficient Large Language Models ☆1,270 · Updated this week
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ☆171 · Updated last year
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide info for efficient AIGC research, including languag… ☆153 · Updated 2 weeks ago
- Awesome LLM compression research papers and tools. ☆1,202 · Updated this week
- ☆169 · Updated 3 months ago
- Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models" ☆254 · Updated 2 months ago
- A Pytorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models ☆642 · Updated last year
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆571 · Updated 6 months ago
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023). ☆275 · Updated last year
- Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod… ☆311 · Updated 2 months ago
- ☆289 · Updated 7 months ago
- Survey Paper List - Efficient LLM and Foundation Models ☆220 · Updated 2 months ago
- List of papers related to neural network quantization in recent AI conferences and journals. ☆459 · Updated 2 months ago
- [ICML 2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation ☆633 · Updated last month
- ☆575 · Updated last week
- [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models. ☆391 · Updated 3 months ago
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference ☆362 · Updated this week
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding ☆230 · Updated 2 months ago