WebGPU LLM inference tuned by hand
☆150Jun 24, 2023Updated 2 years ago
Alternatives and similar repositories for token-hawk
Users that are interested in token-hawk are comparing it to the libraries listed below.
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L…☆45Jun 13, 2023Updated 3 years ago
- Deploy your GGML models to HuggingFace Spaces with Docker and gradio☆38Jun 6, 2023Updated 3 years ago
- Inference Llama 2 in one file of pure JavaScript(HTML)☆36May 20, 2025Updated last year
- A guidance language for controlling large language models.☆43Jun 9, 2023Updated 3 years ago
- A minimal metal application☆14Mar 24, 2021Updated 5 years ago
- A text-based, 5e-compatible RPG with an AI Dungeon Master that rolls real dice, tracks real stats, and plays by the rules. Built on the S…☆37May 18, 2026Updated 3 weeks ago
- ggml implementation of BERT☆500Feb 23, 2024Updated 2 years ago
- Erudito: Easy API/CLI to ask questions about your documentation☆98Nov 6, 2023Updated 2 years ago
- A torchless C++ RWKV implementation using 8-bit quantization, written in CUDA/HIP/Vulkan for maximum compatibility and minimum dependenci…☆312Jan 31, 2024Updated 2 years ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation☆864Nov 16, 2024Updated last year
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)☆572Aug 8, 2023Updated 2 years ago
- Makes llama.cpp easy to use.☆12May 14, 2025Updated last year
- A Next.js chat app to use Llama 2 locally using node-llama-cpp☆12Oct 27, 2024Updated last year
- C++ implementation for 💫StarCoder☆458Sep 9, 2023Updated 2 years ago
- various experiments for scaling inference time compute with small reasoning models☆17Jan 16, 2025Updated last year
- ☆34May 28, 2023Updated 3 years ago
- "rust-openai-chatgpt-api" is a Rust library for accessing the ChatGPT API, a powerful NLP platform by OpenAI. The library provides a simp…☆11Mar 30, 2023Updated 3 years ago
- Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP.☆15May 3, 2021Updated 5 years ago
- A fork of textgen that kept some things like Exllama and old GPTQ.☆22Aug 20, 2024Updated last year
- minimal diffusion transformer in pytorch.☆17Oct 6, 2024Updated last year
- Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…☆144Oct 17, 2023Updated 2 years ago
- A lightweight Python utility that aggregates and exports comprehensive system information to JSON, specifically designed for feeding syst…☆13Apr 13, 2025Updated last year
- Python bindings for the Transformer models implemented in C/C++ using GGML library.☆1,886Jan 28, 2024Updated 2 years ago
- trying to make WebGPU a bit easier to use☆19Jan 9, 2024Updated 2 years ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe…☆161Feb 9, 2024Updated 2 years ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model☆1,572Mar 23, 2025Updated last year
- Efficient 3bit/4bit quantization of LLaMA models☆18May 18, 2023Updated 3 years ago
- ☆456Oct 15, 2023Updated 2 years ago
- LLM-based code completion engine☆194Jan 23, 2025Updated last year
- A Swift package for interacting with selenium and undetected-chromedriver through python by using PythonKit.☆13Jun 21, 2025Updated 11 months ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia☆42Mar 13, 2023Updated 3 years ago
- Plug n Play GBNF Compiler for llama.cpp☆32Nov 8, 2023Updated 2 years ago
- Emacs package for LLM-assisted code/text completion☆43May 22, 2026Updated 3 weeks ago
- ☆16Dec 16, 2024Updated last year
- A repo to hold some simple experiments☆14May 4, 2022Updated 4 years ago
- Landmark Attention: Random-Access Infinite Context Length for Transformers☆426Dec 20, 2023Updated 2 years ago
- A Javascript library (with Typescript types) to parse metadata of GGML based GGUF files.☆52Jul 30, 2024Updated last year
- A cross-platform browser ML framework.☆763May 26, 2026Updated 2 weeks ago
- Download full or partial git-lfs repos without temporarily using 2x disk space☆32Oct 13, 2023Updated 2 years ago