gioelecrispo/chunkipy

gioelecrispo / chunkipy

chunkipy is a tool for segmenting long texts into smaller chunks, based on either a character count or a token count. Its customizable chunk sizes and splitting strategies give flexibility and control across a range of text processing tasks.

☆37

Alternatives and similar repositories for chunkipy

Users that are interested in chunkipy are comparing it to the libraries listed below.

hossain-khan / firebase-mock-api-server
A simple mock API server using expressjs that is hosted on firebase.
☆10 · Jun 29, 2022 · Updated 4 years ago
SALT-NLP / multi-value
Complete set of English dialect transformation rules and evaluation code
☆16 · Jun 7, 2024 · Updated 2 years ago
hassancs91 / tiny-ai-agent-with-llm-router
☆15 · Mar 18, 2025 · Updated last year
sileod / llm-theory-of-mind
Testing Theory of Mind (ToM) in language models with epistemic logic
☆22 · Jul 3, 2026 · Updated 2 weeks ago
mickymultani / Groq-RAG
RAG Chatbot powered by Groq LPU, Ollama and Langchain
☆13 · Mar 5, 2024 · Updated 2 years ago
kmsravindra / bayesian_changepoint_detection
Methods to get the probability of a changepoint in a time series.
☆11 · Jun 8, 2020 · Updated 6 years ago
entscheidsuche / NeueScraper
New scrapers
☆11 · Updated this week
AdaptiveBProcess / DeepSimulator
DeepSimulator is a hybrid tool between DDS and DL techniques to simulate business processes
☆11 · Mar 31, 2023 · Updated 3 years ago
UEWBot / dipvis
Django-based visualiser for tournaments for the boardgame Diplomacy
☆10 · Updated this week
mzbac / image2dsl
This repository contains the implementation of an Image to DSL (Domain Specific Language) model. The model uses a pre-trained Vision Tran…
☆13 · Apr 19, 2023 · Updated 3 years ago
justinwlin / runpodWhisperx
Runpod WhisperX Docker Container Repo
☆16 · Mar 10, 2024 · Updated 2 years ago
ellenealds / streamlit_template_cohere_semantic_search
☆11 · Apr 17, 2023 · Updated 3 years ago
AbdullahBahi / FX-Manager
Python package for developing and testing algorithmic trading strategies
☆10 · Dec 2, 2021 · Updated 4 years ago
Scherzan / Unlocking-Information-Creating-Synthetic-Data-for-Open-Access
Presentation material for my talk at Pycon DE 2023: Intro on synthetic tabular data including synthetic data generation, evaluation metri…
☆13 · Jan 26, 2026 · Updated 5 months ago
psunlpgroup / ReaLMistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
☆32 · Aug 18, 2024 · Updated last year
daisuke-motoki / change_point_detector
Change point detection by using density ratio estimation
☆20 · Jun 3, 2017 · Updated 9 years ago
edenartlab / eve
Eden is building autonomous creative agents.
☆30 · Jul 13, 2026 · Updated last week
lvrysis / Audio-DNN-Classification
Deep Neural Networks for audio classification
☆10 · Apr 11, 2024 · Updated 2 years ago
GeekDream-x / IDOL
Repo for paper "IDOL: Indicator-oriented Logic Pre-training for Logical Reasoning" accepted to the Findings of ACL 2023
☆22 · Nov 7, 2023 · Updated 2 years ago
Justmalhar / QuickUI-Agent
💻 Hackable UI Generator with LLMs. Build quick MVP UIs with HTML, Tailwind, Font Awesome, Placehold.co and Groq for superfast generation…
☆28 · Jun 11, 2024 · Updated 2 years ago
h-ohsaki / graph-tools
Tools for graph theory and network science with many generation models
☆20 · May 20, 2026 · Updated 2 months ago
mozilla-ai / prompt-saliency
A simple command-line tool to calculate importance of tokens in prompts sent to an LLM.
☆19 · Apr 3, 2026 · Updated 3 months ago
buschmo / Simple-German-Corpus
Code to create the dataset from "A New Aligned Simple German Corpus"
☆11 · Jan 8, 2024 · Updated 2 years ago
Zhang-Yihao / Adversarial-Representation-Engineering
Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.
☆20 · Dec 6, 2024 · Updated last year
h1ddenpr0cess20 / infinigpt-matrix
An AI chatbot for the Matrix chat protocol which can assume any personality imaginable
☆21 · Mar 17, 2026 · Updated 4 months ago
qurator-spk / sbb_ner
Named Entity Recognition
☆19 · Feb 13, 2026 · Updated 5 months ago
babaknaderi / TextComplexityDE
TextComplexityDE dataset consists of 1000 sentences in the German language with subjective complexity rating, collected from German learn…
☆12 · Apr 8, 2022 · Updated 4 years ago
AIAnytime / Quick-Minutes-of-Meeting-using-ChatGPT
This is the official repo of "Quick Minutes of Meeting using ChatGPT" video on AI Anytime YouTube channel. We have used Da Vinci 003 mode…
☆15 · Sep 27, 2023 · Updated 2 years ago
codeforosnabrueck / awesome-opendata-german
A curated list of helpful information on open data
☆20 · Jun 12, 2022 · Updated 4 years ago
machinelearningZH / zix_understandability-index
Measure how understandable a German text is.
☆12 · Updated this week
mohak1 / Automatic-HTML-Code-Generation-from-Images
This project makes use of Object Detection for identifying the symbols in the input image for generation of HTML code.
☆14 · Aug 30, 2024 · Updated last year
br-data / second-opinion
Service for detecting hallucinations in AI generated text responses. Originally created for the AI for Media Network hackathon 2024.
☆19 · Oct 11, 2024 · Updated last year
ahmetustun / udapter
UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi…
☆31 · Dec 5, 2022 · Updated 3 years ago
Riminder / jobcurator
Open source Machine Learning library to clean, normalize, structure, compress and sample large datasets & feeds of job offers.
☆15 · Nov 11, 2025 · Updated 8 months ago
Avaiga / demo-gpt-4o
A Taipy Chatbot that supports images thanks to OpenAI's GPT-4o
☆22 · Apr 13, 2026 · Updated 3 months ago
Marcus0086 / formdata-etl_ai
☆38 · Nov 3, 2024 · Updated last year
vinspdb / MiDA
Multi view deep learning based approach for next activity prediction
☆13 · Apr 7, 2026 · Updated 3 months ago
jtlicardo / process-visualizer
Converting textual descriptions of processes into simplified BPMN diagrams
☆17 · Dec 21, 2023 · Updated 2 years ago
apocas / restai-frontend
RestAI's Frontend
☆22 · Sep 4, 2025 · Updated 10 months ago