eth-easl / deltazip
Compression for Foundation Models
☆19 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for deltazip
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆38 Updated 10 months ago
- ☆38 Updated 4 months ago
- [EMNLP 2024 Main] Virtual Personas for Language Models via an Anthology of Backstories ☆18 Updated this week
- ☆35 Updated 3 weeks ago
- Make triton easier ☆41 Updated 5 months ago
- ☆43 Updated 4 months ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆34 Updated 8 months ago
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs. ☆74 Updated last month
- Experiments to assess SPADE on different LLM pipelines. ☆16 Updated 7 months ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference… ☆18 Updated last year
- Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆79 Updated this week
- ☆22 Updated 10 months ago
- LLMs as Collaboratively Edited Knowledge Bases ☆43 Updated 9 months ago
- PostText is a QA system for querying your text data. When appropriate structured views are in place, PostText is good at answering querie… ☆31 Updated last year
- Cascade Speculative Drafting ☆26 Updated 7 months ago
- Implementation of Hyena Hierarchy in JAX ☆10 Updated last year
- ☆57 Updated last week
- Bamboo-7B Large Language Model ☆89 Updated 7 months ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆87 Updated last month
- Odysseus: Playground of LLM Sequence Parallelism ☆57 Updated 5 months ago
- Structured inference with Llama 2 in your browser ☆52 Updated 2 weeks ago
- FlexAttention w/ FlashAttention3 Support ☆27 Updated last month
- ☆20 Updated 2 months ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models ☆36 Updated last year
- DPO, but faster 🚀 ☆23 Updated 3 weeks ago
- Efficient, Flexible and Portable Structured Generation ☆53 Updated this week
- Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs) ☆12 Updated last month
- ☆24 Updated last year
- A repository for research on medium sized language models. ☆74 Updated 5 months ago