huggingface / hf_transfer
☆462Updated last month
Alternatives and similar repositories for hf_transfer
Users who are interested in hf_transfer are comparing it to the libraries listed below
- Official implementation of Half-Quadratic Quantization (HQQ)☆807Updated last week
- ☆531Updated 8 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆301Updated last year
- Inference code for Mistral and Mixtral hacked up into original Llama implementation☆371Updated last year
- ☆515Updated 5 months ago
- Minimalistic large language model 3D-parallelism training☆1,850Updated this week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…☆323Updated 5 months ago
- A bagel, with everything.☆320Updated last year
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆199Updated 9 months ago
- This is our own implementation of 'Layer Selective Rank Reduction'☆238Updated 11 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters☆256Updated 10 months ago
- Implementation of DoRA☆294Updated 11 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆154Updated 6 months ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.☆723Updated 7 months ago
- PyTorch building blocks for the OLMo ecosystem☆210Updated this week
- Advanced Quantization Algorithm for LLMs/VLMs.☆454Updated this week
- Inference code for Persimmon-8B☆415Updated last year
- scalable and robust tree-based speculative decoding algorithm☆345Updated 3 months ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆274Updated last year
- OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training☆491Updated 4 months ago
- ☆532Updated 6 months ago
- batched loras☆342Updated last year
- [ICML 2024] CLLMs: Consistency Large Language Models☆391Updated 5 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…☆244Updated this week
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models☆237Updated last year
- A benchmark for emotional intelligence in large language models☆289Updated 9 months ago
- An Open Source Toolkit For LLM Distillation☆596Updated last week
- ☆713Updated last week
- [ACL 2024] Progressive LLaMA with Block Expansion.☆502Updated 11 months ago
- ☆706Updated last year