A family of efficient edge language models in 100M~1B sizes.
☆19Feb 14, 2025Updated last year
Alternatives and similar repositories for EfficientLLM
Users that are interested in EfficientLLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"☆31Mar 5, 2026Updated 3 weeks ago
- [NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer☆30Dec 6, 2023Updated 2 years ago
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models☆187Jan 1, 2025Updated last year
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding☆31Jan 27, 2026Updated last month
- The official implementations of Noise-Informed Diffusion-Generated Image Detection With Anomaly Attention (TIFS 2025)☆18Jun 23, 2025Updated 9 months ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- An Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales☆16Jun 6, 2024Updated last year
- ☆13Sep 25, 2023Updated 2 years ago
- ☆13Oct 13, 2025Updated 5 months ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆44Updated this week
- Code Repository for the NeurIPS 2024 Paper "Toward Efficient Inference for Mixture of Experts".☆19Oct 30, 2024Updated last year
- ☆12Apr 27, 2024Updated last year
- Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores (EuroSys'25)☆15Jul 17, 2025Updated 8 months ago
- Python implementation of the Huffman Code compression algorithm.☆14Apr 18, 2013Updated 12 years ago
- NITEC: Versatile Hand-Annotated Eye Contact Dataset for Ego-Vision Interaction (WACV24)☆19Jul 17, 2024Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Self-Distribution BNN☆10Mar 8, 2022Updated 4 years ago
- Mono depth on nuscenes dataset☆21Feb 25, 2021Updated 5 years ago
- This is an implementation of the LaserNet paper, a CNN for Lidar 3D object detection☆16Jul 21, 2020Updated 5 years ago
- ☆14Dec 4, 2020Updated 5 years ago
- Code and data for ACL 2024 paper on 'Cross-Modal Projection in Multimodal LLMs Doesn't Really Project Visual Attributes to Textual Space'☆19Jul 21, 2024Updated last year
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models☆72Jan 6, 2024Updated 2 years ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario☆30Oct 5, 2025Updated 5 months ago
- Efficient 2:4 sparse training algorithms and implementations☆59Dec 8, 2024Updated last year
- This is official implementation of "Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise"…☆21Mar 18, 2025Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- ☆19Jul 30, 2021Updated 4 years ago
- Code repo for FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs.☆32Nov 5, 2025Updated 4 months ago
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer☆16Sep 7, 2024Updated last year
- ☆18Jun 6, 2019Updated 6 years ago
- [ICML 2025🔥] ParallelComp: Parallel Long-Context Compressor for Length Extrapolation☆30Jun 16, 2025Updated 9 months ago
- Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs☆23Nov 11, 2025Updated 4 months ago
- Finetune Google's pre-trained ViT models from HuggingFace's model hub.☆19Apr 4, 2021Updated 4 years ago
- Caffe implementation of Optimal-Ternary-Weights-Approximation in "Two-Step Quantization for Low-bit Neural Networks" (CVPR2018).☆14Sep 21, 2018Updated 7 years ago
- [DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive La…☆84Jun 30, 2024Updated last year
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- Code to implement the experiments in "Post-training Quantization for Neural Networks with Provable Guarantees" by Jinjie Zhang, Yixuan Zh…☆11Jun 2, 2023Updated 2 years ago
- Deeplearning4j Android Example repository☆11Feb 8, 2016Updated 10 years ago
- End-to-End Neural Network for Multi-Object Tracking☆23Jun 17, 2019Updated 6 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- ☆13Apr 10, 2017Updated 8 years ago
- ☆23Jun 13, 2024Updated last year
- ☆22Jun 10, 2025Updated 9 months ago