nyunAI / PruneGPT
☆53Updated last year
Alternatives and similar repositories for PruneGPT
Users that are interested in PruneGPT are comparing it to the libraries listed below
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆78Updated last month
- Data preparation code for CrystalCoder 7B LLM☆45Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs☆85Updated last month
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss.☆129Updated last year
- entropix style sampling + GUI☆26Updated 7 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆62Updated 10 months ago
- My fork of Allen AI's OLMo for educational purposes.☆30Updated 6 months ago
- Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)☆85Updated last week
- ☆66Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging.☆36Updated last year
- Simple examples using Argilla tools to build AI☆53Updated 7 months ago
- ☆76Updated last year
- ☆51Updated 7 months ago
- Data preparation code for Amber 7B LLM☆91Updated last year
- This is the official repository for Inheritune.☆111Updated 4 months ago
- A repository aimed at pruning DeepSeek V3, R1 and R1-zero to a usable size☆58Updated 2 months ago
- A framework for evaluating function calls made by LLMs☆37Updated 10 months ago
- A pipeline for LLM knowledge distillation☆104Updated 2 months ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…☆154Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT☆27Updated last year
- ☆47Updated 9 months ago
- QuIP quantization☆54Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models☆173Updated last year
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆26Updated 4 months ago
- Simple GRPO scripts and configurations.☆58Updated 4 months ago
- Just a bunch of benchmark logs for different LLMs☆119Updated 10 months ago
- ☆114Updated 6 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆66Updated 7 months ago
- minimal scripts for 24GB VRAM GPUs. training, inference, whatever☆40Updated last week
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆143Updated 9 months ago