99991 / pygguf
GGUF parser in Python
☆26 · Updated 6 months ago
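As a quick illustration of what a GGUF parser has to do, here is a minimal sketch that reads only the fixed GGUF file header, following the publicly documented GGUF container layout. It assumes GGUF version 2 or later (where the counts are 64-bit), and the function name `read_gguf_header` is made up for this example; it is not pygguf's actual API.

```python
import struct

def read_gguf_header(path):
    """Illustrative sketch, not pygguf's API. Per the public GGUF spec:
    4-byte magic b"GGUF", little-endian uint32 version, then (for
    GGUF v2+) uint64 tensor count and uint64 metadata key/value count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        version, = struct.unpack("<I", f.read(4))
        n_tensors, = struct.unpack("<Q", f.read(8))   # uint32 in GGUF v1
        n_kv, = struct.unpack("<Q", f.read(8))        # uint32 in GGUF v1
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}

# Example usage (the path is hypothetical):
# print(read_gguf_header("model-q4_0.gguf"))
```

A full parser continues from here by decoding the metadata key/value pairs and the tensor info table, which is what pygguf and the other GGUF tools listed below implement.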
Alternatives and similar repositories for pygguf:
Users interested in pygguf are comparing it to the libraries listed below.
- Some random tools for working with the GGUF file format ☆25 · Updated last year
- RWKV-7: Surpassing GPT ☆79 · Updated 3 months ago
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆40 · Updated last year
- Code for the paper "QuIP: 2-Bit Quantization of Large Language Models With Guarantees", adapted for Llama models ☆36 · Updated last year
- Experiments with BitNet inference on CPU ☆53 · Updated 10 months ago
- Repository for CPU Kernel Generation for LLM Inference ☆25 · Updated last year
- ☆16 · Updated 11 months ago
- RWKV in nanoGPT style ☆187 · Updated 8 months ago
- Python bindings for ggml ☆137 · Updated 5 months ago
- Demonstration that finetuning a RoPE model on sequences longer than those seen in pre-training extends the model's context limit ☆63 · Updated last year
- New optimizer ☆19 · Updated 6 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆77 · Updated this week
- ☆53 · Updated 8 months ago
- QuIP quantization ☆50 · Updated 11 months ago
- tinygrad port of the RWKV large language model ☆44 · Updated 8 months ago
- GGML implementation of the BERT model with Python bindings and quantization ☆53 · Updated last year
- Boosting 4-bit inference kernels with 2:4 sparsity ☆64 · Updated 5 months ago
- ☆65 · Updated 2 months ago
- ☆49 · Updated 11 months ago
- ☆48 · Updated 3 months ago
- My implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆31 · Updated 6 months ago
- Implementation of Nougat that focuses on processing PDFs locally ☆79 · Updated last month
- QLoRA with enhanced multi-GPU support ☆36 · Updated last year
- ☆21 · Updated 3 months ago
- A pipeline for LLM knowledge distillation ☆91 · Updated 3 weeks ago
- Lightweight continuous batching with OpenAI compatibility, built on HuggingFace Transformers, including T5 and Whisper ☆20 · Updated this week
- ☆44 · Updated 7 months ago
- A general 2-8 bit quantization toolbox with GPTQ/AWQ/HQQ/VPTQ and easy export to ONNX/ONNX Runtime ☆159 · Updated last week