kir-gadjello / zipslicer
A library for incremental loading of large PyTorch checkpoints
☆56 Updated 2 years ago
Alternatives and similar repositories for zipslicer:
Users who are interested in zipslicer compare it to the libraries listed below.
- A playground to make it easy to try crazy things ☆33 Updated this week
- Tiny inference-only implementation of LLaMA ☆92 Updated last year
- ☆40 Updated 2 years ago
- Implement recursion using English as the programming language and an LLM as the runtime. ☆137 Updated 2 years ago
- A fork of llama3.c used to do some R&D on inferencing ☆20 Updated 3 months ago
- ☆35 Updated 2 years ago
- Converts JSON-Schema to GBNF grammar to use with llama.cpp ☆52 Updated last year
- Testing various image matching algorithms' performance on the Pinecone vector DB ☆43 Updated last year
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆66 Updated last year
- A star for organising blocks and playing with transformers. ☆23 Updated 11 months ago
- Web browser version of StarCoder.cpp ☆44 Updated last year
- iterate quickly with llama.cpp hot reloading. use the llama.cpp bindings with bun.sh ☆48 Updated last year
- utilities for loading and running text embeddings with onnx ☆44 Updated 8 months ago
- assign color hues to a collection of text fragments based on embeddings ☆20 Updated 9 months ago
- A novel approach for transformer model introspection that enables saving, compressing, and manipulating internal thought states for advan… ☆18 Updated last week
- A copy of ONNX models, datasets, and code all in one GitHub repository. Follow the README to learn more. ☆105 Updated last year
- Fast inference of Instruct tuned LLaMa on your personal devices. ☆22 Updated 2 years ago
- WebGPU LLM inference tuned by hand ☆149 Updated last year
- Chat Markup Language conversation library ☆55 Updated last year
- Editor with LLM generation tree exploration ☆65 Updated 2 months ago
- Command-line script for inferencing from models such as MPT-7B-Chat ☆101 Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆56 Updated last year
- Hierarchical topic segmentation of meeting transcripts using embeddings and divisive clustering. ☆52 Updated 8 months ago
- Simple LLM inference server ☆20 Updated 10 months ago
- Extensible AI assistant platform that bridges LLMs to tasks and actions ☆38 Updated last year
- Command-line script for inferencing from models such as LLaMA, in a chat scenario, with LoRA adaptations ☆33 Updated last year
- ☆31 Updated last year
- ☆126 Updated last year
- A super simple web interface to perform blind tests on LLM outputs. ☆28 Updated last year
- Tool to create a dataset of semantic segmentation on website screenshots from their DOM ☆89 Updated 2 years ago