gioelecrispo / chunkipy
chunkipy is an extremely useful tool for segmenting long texts into smaller chunks, based on either a character or token count. With customizable chunk sizes and splitting strategies, chunkipy provides flexibility and control for various text processing tasks.
☆35 · Updated last year
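To make the idea of splitting by "character or token count" concrete, here is a minimal sketch in plain Python. It does not use chunkipy's actual API, which this page does not show. The names `chunk_text`, `size_fn`, and `count_tokens` are hypothetical and chosen only for illustration, and the whitespace "tokenizer" is a stand-in assumption for whatever tokenizer a real pipeline would use.

```python
from typing import Callable, List


def chunk_text(text: str, max_size: int,
               size_fn: Callable[[str], int] = len) -> List[str]:
    """Greedily pack whitespace-separated words into chunks.

    size_fn measures a candidate chunk: len() counts characters,
    while a token counter can be passed in to count tokens instead.
    A single word larger than max_size is emitted as its own
    (oversized) chunk rather than being split further.
    """
    chunks: List[str] = []
    current: List[str] = []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and size_fn(candidate) > max_size:
            # Adding this word would overflow: close the current chunk.
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


def count_tokens(s: str) -> int:
    """Hypothetical stand-in tokenizer: one token per whitespace-separated word."""
    return len(s.split())


if __name__ == "__main__":
    # Character-based limit of 10 characters per chunk.
    print(chunk_text("the quick brown fox jumps", 10))
    # Token-based limit of 2 tokens per chunk.
    print(chunk_text("a b c d e", 2, size_fn=count_tokens))
```

Swapping `size_fn` is the only change needed to move between the two counting strategies; a real splitter would typically also prefer sentence or paragraph boundaries over raw word boundaries.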
Alternatives and similar repositories for chunkipy:
Users who are interested in chunkipy are comparing it to the libraries listed below.
- Explore the use of DSPy for extracting features from PDFs ☆39 · Updated last year
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆100 · Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 · Updated 5 months ago
- Mistral + Haystack: build RAG pipelines that rock ☆103 · Updated last year
- ☆19 · Updated 6 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆61 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆47 · Updated last year
- ☆45 · Updated last year
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆67 · Updated 5 months ago
- ☆66 · Updated 11 months ago
- Data extraction with LLM on CPU ☆113 · Updated last year
- ☆20 · Updated last year
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 · Updated 7 months ago
- Framework for building, orchestrating and deploying multi-agent systems. Managed by OpenAI Solutions team. Experimental framework. ☆90 · Updated 6 months ago
- Lite weight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆30 · Updated 8 months ago
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I… ☆90 · Updated last week
- ☆30 · Updated 9 months ago
- Using various instructor clients evaluating the quality and capabilities of extractions and reasoning. ☆50 · Updated 6 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆34 · Updated last year
- ☆77 · Updated 10 months ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 · Updated 7 months ago
- Pre-train Static Word Embeddings ☆56 · Updated 2 weeks ago
- AI real estate agent ☆34 · Updated last year
- ☆67 · Updated 5 months ago
- An integration of Qdrant ANN vector database backend with txtai ☆24 · Updated 8 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated 9 months ago
- A framework for evaluating function calls made by LLMs ☆37 · Updated 9 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆102 · Updated 3 weeks ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 · Updated last year
- Plug-and-play NLP pipelines without training. ☆50 · Updated this week