WebGPU LLM inference tuned by hand
☆150 · Jun 24, 2023 · Updated 2 years ago
Alternatives and similar repositories for token-hawk
Users that are interested in token-hawk are comparing it to the libraries listed below.
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L… ☆45 · Jun 13, 2023 · Updated 2 years ago
- Inference Llama 2 in one file of pure JavaScript(HTML) ☆36 · May 20, 2025 · Updated 11 months ago
- A guidance language for controlling large language models. ☆43 · Jun 9, 2023 · Updated 2 years ago
- A minimal metal application ☆14 · Mar 24, 2021 · Updated 5 years ago
- ☆22 · Feb 21, 2023 · Updated 3 years ago
- ggml implementation of BERT ☆501 · Feb 23, 2024 · Updated 2 years ago
- Erudito: Easy API/CLI to ask questions about your documentation ☆98 · Nov 6, 2023 · Updated 2 years ago
- A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci… ☆312 · Jan 31, 2024 · Updated 2 years ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆858 · Nov 16, 2024 · Updated last year
- ☆11 · Oct 11, 2023 · Updated 2 years ago
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆570 · Aug 8, 2023 · Updated 2 years ago
- Makes llama.cpp easy to use. ☆12 · May 14, 2025 · Updated 11 months ago
- GLiNER inference in JavaScript ☆26 · Mar 2, 2025 · Updated last year
- A Next.js chat app to use Llama 2 locally using node-llama-cpp ☆12 · Oct 27, 2024 · Updated last year
- C++ implementation for 💫StarCoder ☆458 · Sep 9, 2023 · Updated 2 years ago
- Train llama with lora on one 4090 and merge weight of lora to work as stanford alpaca. ☆52 · Jun 16, 2023 · Updated 2 years ago
- various experiments for scaling inference time compute with small reasoning models ☆17 · Jan 16, 2025 · Updated last year
- ☆34 · May 28, 2023 · Updated 2 years ago
- Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP. ☆15 · May 3, 2021 · Updated 5 years ago
- A fork of textgen that kept some things like Exllama and old GPTQ. ☆22 · Aug 20, 2024 · Updated last year
- A lightweight Python utility that aggregates and exports comprehensive system information to JSON, specifically designed for feeding syst… ☆13 · Apr 13, 2025 · Updated last year
- Python bindings for the Transformer models implemented in C/C++ using GGML library. ☆1,886 · Jan 28, 2024 · Updated 2 years ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆159 · Feb 9, 2024 · Updated 2 years ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model ☆1,569 · Mar 23, 2025 · Updated last year
- Efficient 3bit/4bit quantization of LLaMA models ☆18 · May 18, 2023 · Updated 2 years ago
- A minimal Python re-implementation of the A* with seed heuristic for exact global alignment (edit distance) in near-linear time ☆22 · Nov 30, 2024 · Updated last year
- LLM-based code completion engine ☆192 · Jan 23, 2025 · Updated last year
- A Swift package for interacting with selenium and undetected-chromedriver through python by using PythonKit. ☆13 · Jun 21, 2025 · Updated 10 months ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Mar 13, 2023 · Updated 3 years ago
- ☆16 · Dec 16, 2024 · Updated last year
- A repo to hold some simple experiments ☆14 · May 4, 2022 · Updated 4 years ago
- ☆13 · May 25, 2023 · Updated 2 years ago
- ☆15 · Jun 5, 2023 · Updated 2 years ago
- A Javascript library (with Typescript types) to parse metadata of GGML based GGUF files. ☆52 · Jul 30, 2024 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆426 · Dec 20, 2023 · Updated 2 years ago
- A cross-platform browser ML framework. ☆757 · Apr 2, 2026 · Updated last month
- Download full or partial git-lfs repos without temporarily using 2x disk space ☆31 · Oct 13, 2023 · Updated 2 years ago
- Builds Dawn on Linux and macOS as one single easier-to-use library ☆29 · Dec 5, 2021 · Updated 4 years ago
- A distributed execution framework built upon lunatic. ☆16 · Jan 19, 2024 · Updated 2 years ago