NolanoOrg / SpectraSuite
☆43 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for SpectraSuite
- A toolkit for fine-tuning, running inference with, and evaluating GreenBitAI's LLMs. ☆72 · Updated 3 weeks ago
- A repository for research on medium-sized language models. ☆74 · Updated 5 months ago
- QuIP quantization. ☆46 · Updated 7 months ago
- The code repository for the CURLoRA research paper: stable LLM continual fine-tuning and catastrophic-forgetting mitigation. ☆37 · Updated 2 months ago
- ☆35 · Updated last week
- PB-LLM: Partially Binarized Large Language Models. ☆146 · Updated 11 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models. ☆19 · Updated 9 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters. ☆104 · Updated last month
- ☆62 · Updated last month
- Experiments with various linear attention designs. ☆56 · Updated 6 months ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization. ☆85 · Updated 3 weeks ago
- Fast approximate inference on a single GPU with sparsity-aware offloading. ☆38 · Updated 10 months ago
- A single repo with all scripts and utilities to train or fine-tune the Mamba model, with or without FIM. ☆49 · Updated 7 months ago
- Sparse fine-tuning of LLMs via a modified version of MosaicML's llmfoundry. ☆38 · Updated 9 months ago
- ☆41 · Updated last year
- [WIP] Transformer to embed Danbooru labelsets. ☆13 · Updated 7 months ago
- Code and materials related to speeding up LLM inference via token merging. ☆29 · Updated 6 months ago
- Data-preparation code for the CrystalCoder 7B LLM. ☆42 · Updated 6 months ago
- ☆39 · Updated 9 months ago
- Breaking the Throughput-Latency Trade-off for Long Sequences with Speculative Decoding. ☆70 · Updated this week
- ☆38 · Updated this week
- Cascade Speculative Drafting. ☆26 · Updated 7 months ago
- ☆24 · Updated last month
- This repo is based on https://github.com/jiaweizzhao/GaLore. ☆18 · Updated last month
- Repository for CPU kernel generation for LLM inference. ☆24 · Updated last year
- ☆15 · Updated 7 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆26 · Updated last week
- Demonstration that fine-tuning a RoPE model on sequences longer than its pre-training length extends the model's context limit. ☆63 · Updated last year
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation. ☆45 · Updated 3 months ago