mirth/chonky

Fully neural approach for text chunking

☆416

Alternatives and similar repositories for chonky

Users who are interested in chonky are comparing it to the libraries listed below.

dipampaul17 / KVSplit
Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit …
☆360 · May 21, 2025 · Updated last year
devflowinc / hn-search-RAG
Hacker News Search and RAG built using Rust actix-web, minijinja, SolidJS, Vite, and Redis queues
☆32 · Dec 11, 2024 · Updated last year
marv1nnnnn / llm-min.txt
Min.js Style Compression of Tech Docs for LLM Context
☆677 · Oct 5, 2025 · Updated 9 months ago
babycommando / neuralgraffiti
Live-bending a foundation model’s output at neural network level.
☆275 · Apr 7, 2025 · Updated last year
zoner72 / Datavizion-RAG
Retrieval-augmented generation (RAG) for remote & local LLM use
☆45 · May 24, 2025 · Updated last year
imdj / HNRelevant
A browser extension that adds a "Related Submissions" section to Hacker News
☆146 · Jun 21, 2026 · Updated last month
Foreseerr / TScale
☆197 · May 5, 2025 · Updated last year
prateekvellala / retrieval-experiments
Exploring retrieval systems for language models
☆14 · Apr 12, 2025 · Updated last year
nathanrs / gpt2-webgl
A browser-based, WebGL2 implementation of GPT-2 with transform block and attention matrix visualization
☆346 · Oct 24, 2025 · Updated 8 months ago
RingsNetwork / rings-wasm-p2p
This example shows how Rings Network works in WASM and browser environments.
☆14 · Jan 22, 2024 · Updated 2 years ago
habedi / cogitator
A Python toolkit for chain-of-thought prompting 🐍
☆187 · Jul 5, 2026 · Updated 2 weeks ago
pyrustic / jinbase
Multi-model transactional embedded database
☆67 · Dec 10, 2024 · Updated last year
raphael-seo / Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
☆677 · May 13, 2026 · Updated 2 months ago
matiasmolinas / evolving-agents
Your toolkit for autonomous, evolving agent ecosystems. Create, execute, govern, and evolve agents that learn from experience, collaborat…
☆452 · Nov 24, 2025 · Updated 7 months ago
SureScaleAI / cleverbee
CleverBee - The Open Source Deep Researcher Tool
☆302 · Jan 31, 2026 · Updated 5 months ago
reltadev / github-assistant
Explore GitHub repositories with natural language questions
☆98 · Dec 22, 2024 · Updated last year
morphik-org / morphik-core
Open-source multimodal retrieval engine (Morphik Core). By Morphik — AI back office for skilled nursing & senior living (morphik.ai).
☆3,633 · Jul 5, 2026 · Updated 2 weeks ago
osmzoso / pbf2sqlite
A command line tool for importing OpenStreetMap PBF or XML files into a SQLite database.
☆83 · Jun 14, 2026 · Updated last month
merliot / hub
Merliot Device Hub
☆165 · Jun 11, 2025 · Updated last year
PaulPauls / llama3_interpretability_sae
A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full…
☆640 · Mar 23, 2025 · Updated last year
simonw / llm-hacker-news
LLM plugin for pulling content from Hacker News
☆129 · May 5, 2025 · Updated last year
denoland / deno_pypi
☆29 · Jul 15, 2026 · Updated last week
Pringled / pyversity
Fast Diversification for Search & Retrieval
☆493 · May 24, 2026 · Updated last month
QDScholium / ScholiumAI
Your AI research assistant
☆79 · Mar 31, 2025 · Updated last year
featureform / enrichmcp
EnrichMCP is a Python framework for building data-driven MCP servers
☆645 · Mar 1, 2026 · Updated 4 months ago
slaily / aiosqlitepool
🛡️ A resilient, high-performance asynchronous connection pool layer for SQLite, designed for efficient and scalable database operations.
☆430 · Jul 21, 2025 · Updated last year
Z-Gort / Reservoirs-Lab
☆281 · Jun 8, 2025 · Updated last year
olivierDuchenne / LLM_json_schema
Guarantee that the output of an LLM follows a JSON schema.
☆25 · Dec 6, 2023 · Updated 2 years ago
M4THYOU / TokenDagger
High-Performance Implementation of OpenAI's TikToken.
☆475 · Jul 3, 2025 · Updated last year
koaning / smartfunc
Turn docstrings into LLM-functions
☆516 · Jun 15, 2026 · Updated last month
Brandon-c-tech / RAG-logger
RAG Logger is an open-source logging tool designed specifically for Retrieval-Augmented Generation (RAG) applications. It serves as a lig…
☆226 · Dec 24, 2024 · Updated last year
stanford-mast / blast
Open-source VMs-as-a-service
☆777 · May 29, 2026 · Updated last month
facebookresearch / MILS
Code release for "LLMs can see and hear without any training"
☆460 · May 8, 2025 · Updated last year
tadpolehq / tadpole
☆99 · Feb 21, 2026 · Updated 5 months ago
pig-dot-dev / muscle-mem
A cache for AI agents to learn and replay complex behaviors.
☆763 · Jun 15, 2025 · Updated last year
rescrv / napkin
Back-of-the-envelope stuffs in Python
☆20 · Sep 13, 2023 · Updated 2 years ago
ngafar / llama-scan
Transcribe PDFs with local LLMs
☆813 · Jan 27, 2026 · Updated 5 months ago
arc-eng / cli
Command-line interface for the Arcane Engine
☆43 · Oct 31, 2024 · Updated last year
telekinesis-inc / aiopandas
Lightweight Pandas monkey-patch that adds async support to map, apply, applymap, aggregate, and transform, enabling seamless handling of …
☆134 · Jun 12, 2025 · Updated last year