Python library to use Pleias-RAG models
☆68 · May 1, 2025 · Updated 10 months ago
Alternatives and similar repositories for Pleias-RAG-Library
Users who are interested in Pleias-RAG-Library are comparing it to the libraries listed below.
- ☆21 · Oct 14, 2024 · Updated last year
- Model implementation for the contextual embeddings project ☆41 · Jun 2, 2025 · Updated 9 months ago
- ☆15 · Apr 26, 2025 · Updated 10 months ago
- Label shift estimation for transfer difficulty with Familiarity. ☆10 · Feb 4, 2025 · Updated last year
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 · Jun 6, 2022 · Updated 3 years ago
- Small python package to measure OCR quality and other related metrics. ☆27 · Feb 19, 2024 · Updated 2 years ago
- Unofficial entropix impl for Gemma2 and Llama and Qwen2 and Mistral ☆17 · Jan 12, 2025 · Updated last year
- ☆17 · May 8, 2024 · Updated last year
- 🕸 GlotCC Dataset and Pipeline -- NeurIPS 2024 ☆20 · Apr 6, 2025 · Updated 11 months ago
- ☆92 · Jul 4, 2025 · Updated 8 months ago
- Generalist and Lightweight Model for Text Classification ☆193 · Feb 17, 2026 · Updated 2 weeks ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆24 · Jul 12, 2025 · Updated 7 months ago
- Project code for training LLMs to write better unit tests + code ☆21 · May 19, 2025 · Updated 9 months ago
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 · Dec 18, 2022 · Updated 3 years ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆135 · Jul 26, 2025 · Updated 7 months ago
- Fast Multimodal Semantic Deduplication & Filtering ☆892 · Jan 20, 2026 · Updated last month
- code for training & evaluating Contextual Document Embedding models ☆201 · May 14, 2025 · Updated 9 months ago
- ☆44 · Feb 11, 2026 · Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 · Oct 18, 2025 · Updated 4 months ago
- Machine Learning for Cascading ☆84 · Jun 12, 2015 · Updated 10 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 · Nov 30, 2024 · Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies ☆29 · Nov 18, 2025 · Updated 3 months ago
- Late Interaction Models Training & Retrieval ☆732 · Feb 27, 2026 · Updated last week
- Tooling for exact and MinHash deduplication of large-scale text datasets ☆72 · Feb 19, 2026 · Updated 2 weeks ago
- MatFormer repo ☆72 · Dec 9, 2024 · Updated last year
- Ubiflux Vigor ventilation system RS485 Modbus communications with Python ☆11 · Feb 20, 2026 · Updated 2 weeks ago
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ☆63 · Feb 6, 2025 · Updated last year
- AI Energy Score: Initiative to establish comparable energy efficiency ratings for AI models. ☆38 · Dec 2, 2025 · Updated 3 months ago
- Gradio UI for a Cog API ☆70 · Apr 8, 2024 · Updated last year
- Data extraction with LLM on CPU ☆112 · Jan 8, 2024 · Updated 2 years ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆206 · Aug 31, 2024 · Updated last year
- ☆31 · Dec 13, 2023 · Updated 2 years ago
- Command Line Interface for Hugging Face Inference Endpoints ☆65 · Apr 10, 2024 · Updated last year
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 · May 11, 2020 · Updated 5 years ago
- Code and models for the paper titled "Better Feature Integration for Named Entity Recognition", NAACL 2021. ☆30 · Nov 5, 2021 · Updated 4 years ago
- benchmarks for LLM tokenizers ☆17 · Feb 27, 2026 · Updated last week
- OpenAI GPT hosted Agent Framework for Windows and MacOS ☆36 · Jul 8, 2024 · Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 · Feb 29, 2024 · Updated 2 years ago
- awesome synthetic (text) datasets ☆325 · Jan 8, 2026 · Updated last month