mrmps / ai-chunker
Chunk your text more accurately using gpt-4o-mini
☆39 Updated 3 months ago
Related projects
Alternatives and complementary repositories for ai-chunker
- Generalist and Lightweight Model for Text Classification ☆48 Updated 2 months ago
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ☆27 Updated 2 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆42 Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 3 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 Updated 2 months ago
- ☆75 Updated 5 months ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆36 Updated 6 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆164 Updated 3 weeks ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆59 Updated this week
- ☆46 Updated 9 months ago
- Tools to make language models a bit easier to use ☆30 Updated last week
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆40 Updated 8 months ago
- GLiNER model in a FastAPI microservice. ☆28 Updated 2 weeks ago
- Late Interaction Models Training & Retrieval ☆158 Updated last week
- ☆106 Updated 2 weeks ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆97 Updated 7 months ago
- Experimental Code for StructuredRAG: Structured Outputs in Retrieval-Augmented Generation ☆90 Updated this week
- Codebase accompanying the Summary of a Haystack paper. ☆71 Updated last month
- Resources for exploring Generative Feedback Loops with Weaviate! ☆36 Updated this week
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆32 Updated 8 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆129 Updated last month
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆42 Updated 2 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated 9 months ago
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆100 Updated 9 months ago
- Using short models to classify long texts ☆20 Updated last year
- ☆12 Updated 6 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆52 Updated last week
- ☆24 Updated last year
- Let's build better datasets, together! ☆202 Updated 3 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆126 Updated this week