LeonEricsson / llmcontext
Pressure testing the context window of open LLMs
☆22 · Updated 5 months ago
Alternatives and similar repositories for llmcontext:
Users interested in llmcontext are comparing it to the repositories listed below.
- An implementation of Self-Extend, to expand the context window via grouped attention · ☆118 · Updated last year
- [WIP] Transformer to embed Danbooru labelsets · ☆13 · Updated 10 months ago
- An unsupervised model merging algorithm for Transformers-based language models · ☆104 · Updated 9 months ago
- ☆65 · Updated 8 months ago
- ☆74 · Updated last year
- ☆49 · Updated 11 months ago
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning · ☆28 · Updated last year
- entropix style sampling + GUI · ☆25 · Updated 3 months ago
- QuIP quantization · ☆48 · Updated 10 months ago
- Modified Stanford-Alpaca Trainer for Training Replit's Code Model · ☆40 · Updated last year
- Full finetuning of large language models without large memory requirements · ☆93 · Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs · ☆45 · Updated last month
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models · ☆21 · Updated 2 months ago
- Using modal.com to process FineWeb-edu data · ☆20 · Updated 2 months ago
- GPT-2 small trained on phi-like data · ☆65 · Updated 11 months ago
- ☆111 · Updated last month
- RWKV-7: Surpassing GPT · ☆76 · Updated 2 months ago
- Demonstration that finetuning a RoPE model on sequences longer than its pre-training length adapts the model's context limit · ☆63 · Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens · ☆129 · Updated last week
- ☆45 · Updated last week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors · ☆43 · Updated 10 months ago
- ☆27 · Updated last year
- Simple LLM inference server · ☆20 · Updated 8 months ago
- Low-Rank adapter extraction for fine-tuned transformers models · ☆169 · Updated 9 months ago
- ☆123 · Updated 5 months ago
- GPU-accelerated client-side embeddings for vector search, RAG, etc. · ☆65 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs · ☆77 · Updated 10 months ago
- Chat Markup Language conversation library · ☆55 · Updated last year
- Lightweight tools for quick and easy LLM demos · ☆26 · Updated 4 months ago
- Fast approximate inference on a single GPU with sparsity-aware offloading · ☆38 · Updated last year