zxytim / arithmetic-encoding-compression
☆11 · Updated 2 years ago
Alternatives and similar repositories for arithmetic-encoding-compression
Users interested in arithmetic-encoding-compression are comparing it to the libraries listed below.
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but … ☆13 · Updated 2 years ago
- Benchmark tests supporting the TiledCUDA library. ☆16 · Updated 6 months ago
- ☆31 · Updated last year
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better ☆14 · Updated 3 months ago
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆14 · Updated last year
- ☆20 · Updated last month
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory-efficient Transformers. ☆48 · Updated last year
- ☆32 · Updated this week
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning", https://arxiv.org/abs/2505.13934 ☆36 · Updated this week
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆41 · Updated last month
- ☆20 · Updated last year
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 · Updated last year
- The official implementation for "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free" ☆40 · Updated 3 weeks ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆85 · Updated 2 years ago
- An elegant library for Bayesian deep learning in PyTorch ☆25 · Updated 2 years ago
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024) ☆25 · Updated 11 months ago
- ☆14 · Updated 2 years ago
- ☆22 · Updated last year
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆39 · Updated last year
- Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Updated last year
- Code associated with the paper "Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees" ☆28 · Updated 2 years ago
- [ICLR 2024] The official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆27 · Updated last year
- The code and data for the paper JiuZhang3.0 ☆45 · Updated last year
- Official code for "Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM" ☆14 · Updated last year
- Open-Source LLM Coders with Co-Evolving Reinforcement Learning ☆40 · Updated this week
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆31 · Updated 2 weeks ago
- ☆20 · Updated 2 years ago
- LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification ☆54 · Updated 3 months ago
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆19 · Updated 10 months ago
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers ☆46 · Updated 2 years ago