chunkipy is a tool for segmenting long texts into smaller chunks, based on either a character count or a token count. Its customizable chunk sizes and splitting strategies give flexibility and control for various text processing tasks.
☆37Mar 20, 2026Updated last month
Alternatives and similar repositories for chunkipy
Users that are interested in chunkipy are comparing it to the libraries listed below.
- ☆15Mar 18, 2025Updated last year
- Collection of useful scripts for RunPod pods☆15Jan 26, 2024Updated 2 years ago
- exBERT on Transformers🤗☆10Jun 14, 2021Updated 4 years ago
- RAG Chatbot powered by Groq LPU, Ollama and Langchain☆13Mar 5, 2024Updated 2 years ago
- ☆14Sep 23, 2024Updated last year
- [S&P 2026] SoK: Evaluating Jailbreak Guardrails for Large Language Models☆38Dec 17, 2025Updated 4 months ago
- RDF Community Discussions. Ask anything here!☆13Apr 11, 2024Updated 2 years ago
- Python package for developing and testing algorithmic trading strategies☆10Dec 2, 2021Updated 4 years ago
- ☆11Apr 17, 2023Updated 3 years ago
- This repository contains the implementation of an Image to DSL (Domain Specific Language) model. The model uses a pre-trained Vision Tran…☆13Apr 19, 2023Updated 3 years ago
- A Python reimplementation + extension of "Planning with Large Language Models for Code Generation" (https://arxiv.org/abs/2303.05510)☆18Dec 1, 2023Updated 2 years ago
- ZH-color-scheme & theme_stat template for ggplot2.☆18Apr 15, 2026Updated 2 weeks ago
- ☆15May 1, 2025Updated 11 months ago
- Deep Neural Networks for audio classification☆11Apr 11, 2024Updated 2 years ago
- Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and opensource models.☆35Dec 27, 2024Updated last year
- Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.☆20Dec 6, 2024Updated last year
- Presentation material for my talk at Pycon DE 2023: Intro on synthetic tabular data including synthetic data generation, evaluation metri…☆13Jan 26, 2026Updated 3 months ago
- Few-shot learning using EleutherAI's GPT-Neo, an open-source version of GPT-3☆18Jul 8, 2021Updated 4 years ago
- IP Adapter FaceID demo webui☆20Dec 25, 2023Updated 2 years ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.☆610Mar 23, 2026Updated last month
- Project "Named Entity Recognition for the central series of the Staatsarchiv Kanton Zürich"☆10Jul 14, 2025Updated 9 months ago
- Named Entity Recognition☆19Feb 13, 2026Updated 2 months ago
- This is the official repo of "Quick Minutes of Meeting using ChatGPT" video on AI Anytime YouTube channel. We have used Da Vinci 003 mode…☆15Sep 27, 2023Updated 2 years ago
- A simple command-line tool to calculate importance of tokens in prompts sent to an LLM.☆19Apr 3, 2026Updated 3 weeks ago
- ☆12Dec 8, 2024Updated last year
- TextComplexityDE dataset consists of 1000 sentences in the German language with subjective complexity rating, collected from German learn…☆12Apr 8, 2022Updated 4 years ago
- A collection of links on the digitalization of Germany☆17May 5, 2022Updated 3 years ago
- SeamlessM4t-Translator: Utilizing the powerful Seamless M4t Facebook model in the backend, this project facilitates seamless translation …☆13Nov 9, 2023Updated 2 years ago
- Measure how understandable a German text is.☆12Apr 22, 2026Updated last week
- Starter Code (R and Python) for all CSV data sets of Team Data Shop, Statistical Office, Canton Zurich☆13Updated this week
- CLI to capture snapshots, short clips, and run motion detection against RTSP/ONVIF cameras☆79Updated this week
- Code to create the dataset from "A New Aligned Simple German Corpus"☆11Jan 8, 2024Updated 2 years ago
- Build your own AI friend☆22Apr 23, 2026Updated last week
- RestAI's Frontend☆22Sep 4, 2025Updated 7 months ago
- ☆38Nov 3, 2024Updated last year
- ☆16Apr 9, 2025Updated last year
- Catalogue pages of anti-patterns (common bad practices) in software project management and processes☆23Jan 17, 2026Updated 3 months ago
- ☆24Jan 30, 2025Updated last year
- ☆22Aug 24, 2023Updated 2 years ago