epolewski / EricLLM
A fast batching API to serve LLM models
☆172Updated 6 months ago
Related projects
Alternatives and complementary repositories for EricLLM
- Low-Rank adapter extraction for fine-tuned transformers model☆162Updated 6 months ago
- This is our own implementation of 'Layer Selective Rank Reduction'☆232Updated 5 months ago
- A multimodal, function calling powered LLM webui.☆208Updated last month
- Easily view and modify JSON datasets for large language models☆62Updated last month
- Experimental LLM Inference UX to aid in creative writing☆106Updated 4 months ago
- idea: https://github.com/nyxkrage/ebook-groupchat/☆82Updated 3 months ago
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation …☆162Updated 4 months ago
- ☆112Updated this week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI☆221Updated 6 months ago
- function calling-based LLM agents☆278Updated 2 months ago
- ☆149Updated 4 months ago
- A python application that routes incoming prompts to an LLM by category, and can support a single incoming connection from a front end to…☆167Updated this week
- ☆128Updated this week
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM.☆244Updated 2 weeks ago
- A pipeline parallel training script for LLMs.☆83Updated this week
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2.☆126Updated 6 months ago
- An extension that lets the AI take the wheel, allowing it to use the mouse and keyboard, recognize UI elements, and prompt itself :3...no…☆96Updated 3 weeks ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA☆124Updated last year
- ☆227Updated last month
- ☆65Updated last month
- A simple experiment on letting two local LLMs have a conversation about anything!☆91Updated 4 months ago
- Web UI for ExLlamaV2☆445Updated last month
- A frontend for creative writing with LLMs☆108Updated 4 months ago
- Open-source RAG app inspired by Perplexity.☆93Updated this week
- automatically quant GGUF models☆140Updated this week
- ☆118Updated 3 months ago
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…☆145Updated last year
- A python package for developing AI applications with local LLMs.☆140Updated 4 months ago
- For inferring and serving local LLMs using the MLX framework☆89Updated 7 months ago
- An unsupervised model merging algorithm for Transformers-based language models.☆100Updated 6 months ago