kayvr / token-hawk
WebGPU LLM inference tuned by hand
☆147 Updated last year
Related projects
Alternatives and complementary repositories for token-hawk
- Extend the original llama.cpp repo to support redpajama model. ☆117 Updated 2 months ago
- LLaVA server (llama.cpp). ☆177 Updated last year
- Command-line script for inferencing from models such as MPT-7B-Chat ☆102 Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆124 Updated last year
- Mistral7B playing DOOM ☆122 Updated 4 months ago
- Full finetuning of large language models without large memory requirements ☆93 Updated 10 months ago
- Fast parallel LLM inference for MLX ☆149 Updated 4 months ago
- An implementation of bucketMul LLM inference ☆214 Updated 4 months ago
- ☆40 Updated last year
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces ☆131 Updated 4 months ago
- Tensor library for machine learning ☆279 Updated last year
- inference code for mixtral-8x7b-32kseqlen ☆98 Updated 11 months ago
- GRDN.AI app for garden optimization ☆69 Updated 9 months ago
- Unofficial python bindings for the rust llm library. 🐍❤️🦀 ☆73 Updated last year
- Run inference on replit-3B code instruct model using CPU ☆154 Updated last year
- tinygrad port of the RWKV large language model. ☆43 Updated 5 months ago
- LLM-based code completion engine ☆175 Updated last year
- Automated prompting and scoring framework to evaluate LLMs using updated human knowledge prompts ☆111 Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated 10 months ago
- GPT-2 small trained on phi-like data ☆65 Updated 9 months ago
- Python bindings for ggml ☆132 Updated 2 months ago
- ☆84 Updated last month
- Add local LLMs to your Web or Electron apps! Powered by Rust + WebGPU ☆102 Updated last year
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆221 Updated 6 months ago
- Command-line script for inferencing from models such as falcon-7b-instruct ☆75 Updated last year
- ☆149 Updated 4 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆151 Updated this week
- Code for the paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot" with LLaMA implementation. ☆70 Updated last year