gioelecrispo / chunkipy
chunkipy is an extremely useful tool for segmenting long texts into smaller chunks, based on either a character or token count. With customizable chunk sizes and splitting strategies, chunkipy provides flexibility and control for various text processing tasks.
☆34 Updated last year
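To make the description concrete, here is a minimal sketch of size-bounded chunking with a pluggable size measure (characters or tokens). It does not use chunkipy's API: the names `split_sentences`, `chunk_text`, and `whitespace_tokens` are hypothetical, and the regex sentence splitter and whitespace token counter are simplifying assumptions standing in for a real splitting strategy and tokenizer.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitting strategy (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def whitespace_tokens(s: str) -> int:
    """Crude token count (assumption): number of whitespace-separated words."""
    return len(s.split())


def chunk_text(text: str, chunk_size: int, count=len) -> list[str]:
    """Greedily pack sentences into chunks whose size, measured by `count`,
    stays within chunk_size. A single sentence larger than chunk_size is
    emitted as its own oversized chunk rather than being cut mid-sentence.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: list[str] = []
    current = ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}" if current else sentence
        if current and count(candidate) > chunk_size:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "One two three. Four five six. Seven eight nine."
by_chars = chunk_text(text, chunk_size=30)                            # character count
by_tokens = chunk_text(text, chunk_size=3, count=whitespace_tokens)   # token count
```

Passing the size measure as a function keeps the packing logic identical for character-based and token-based limits; swapping in a model-specific tokenizer would only change `count`.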
Alternatives and similar repositories for chunkipy:
Users interested in chunkipy are comparing it to the libraries listed below.
- 🧡 Hacker News summaries ☆19 Updated 10 months ago
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆100 Updated last year
- Convert a web page to markdown ☆63 Updated 5 months ago
- ☆45 Updated 10 months ago
- Chunk your text using gpt4o-mini more accurately ☆43 Updated 6 months ago
- Data extraction with LLM on CPU ☆112 Updated last year
- A framework that uses multi-agents to enable users to perform a systematic data science pipeline with just two inputs. ☆38 Updated 6 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆64 Updated 3 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated 11 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ☆101 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 7 months ago
- ☆58 Updated 3 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆60 Updated 3 months ago
- Table detection with Florence. ☆13 Updated 7 months ago
- Data extraction with LLM on CPU ☆67 Updated last year
- A reimplementation of langgraph's customer support example in Rasa's CALM paradigm and a quantitative evaluation of the 2 approaches ☆74 Updated last week
- ☆18 Updated 4 months ago
- Example demonstrating how to use gpt-4o-mini for fine-tuning ☆25 Updated 5 months ago
- ☆58 Updated last year
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆44 Updated 5 months ago
- ☆65 Updated 8 months ago
- ☆29 Updated 11 months ago
- Explore the use of DSPy for extracting features from PDFs ☆38 Updated 11 months ago
- ☆20 Updated last year
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 Updated 4 months ago
- A tutorial on DSPy and whether automated prompt engineering lives up to the hype ☆22 Updated 9 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆57 Updated 11 months ago
- A personal knowledge base that I can dump information to and help me learn ☆24 Updated 8 months ago
- langchain-streamlit demo with streaming llm, memory, and langsmith feedback ☆18 Updated last week
- ☆47 Updated last year