Plug-and-play document AI with zero-shot models.
☆126 · May 11, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for sieves
Users that are interested in sieves are comparing it to the libraries listed below.
- Lightweight piece tokenization library ☆12 · Apr 15, 2024 · Updated 2 years ago
- Modular Rust transformer/LLM library using Candle ☆39 · May 5, 2024 · Updated 2 years ago
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆26 · Mar 6, 2025 · Updated last year
- FlexiTokens ☆23 · Dec 27, 2025 · Updated 5 months ago
- spaCy entry points for Curated Transformers ☆32 · Mar 27, 2026 · Updated 2 months ago
- Efficient BM25 with DuckDB 🦆 ☆68 · Dec 20, 2024 · Updated last year
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ☆65 · Feb 6, 2025 · Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies ☆31 · Nov 18, 2025 · Updated 6 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆89 · Feb 10, 2026 · Updated 3 months ago
- CMU Linguistic Annotation Backend ☆15 · Sep 22, 2025 · Updated 8 months ago
- Load embeddings and featurize your sentences. ☆31 · Oct 23, 2024 · Updated last year
- Read and modify constituency trees in Rust. ☆10 · May 5, 2020 · Updated 6 years ago
- Parent repository for the MOJ Analytics Platform ☆14 · Nov 16, 2021 · Updated 4 years ago
- 🔢 Work with static vector models ☆39 · Apr 21, 2025 · Updated last year
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆21 · Aug 15, 2024 · Updated last year
- Customize, control, and enhance LLM generation with logits processors, featuring visualization capabilities to inspect and understand sta… ☆47 · Jan 8, 2026 · Updated 5 months ago
- A curated list of materials on AI guardrails ☆55 · Jun 3, 2025 · Updated last year
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 · Sep 5, 2024 · Updated last year
- Synthetic Text Dataset Generation for LLM projects ☆58 · May 27, 2026 · Updated last week
- GLiNER inference in JavaScript ☆27 · Mar 2, 2025 · Updated last year
- Evaluation framework for document processing models and services. ☆75 · May 28, 2026 · Updated last week
- ☆69 · Mar 17, 2022 · Updated 4 years ago
- ☆15 · May 8, 2019 · Updated 7 years ago
- C inference engine for running GLiClass (Generalist and Lightweight Classification) models ☆17 · May 21, 2025 · Updated last year
- KL3M training data collection and preprocessing ☆22 · Apr 14, 2025 · Updated last year
- ☆23 · Jan 2, 2023 · Updated 3 years ago
- simple grpo ☆12 · May 28, 2025 · Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 · May 24, 2024 · Updated 2 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆105 · Apr 23, 2024 · Updated 2 years ago
- The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques. ☆16 · May 22, 2024 · Updated 2 years ago
- Generate Python data structures and XML parser from Xschema (Python 3 port) ☆12 · Jan 13, 2015 · Updated 11 years ago
- 🦦 weasel: A small and easy workflow system ☆94 · Mar 27, 2026 · Updated 2 months ago
- Nearly Inference Free Embeddings: make your RAG queries 500x faster ☆77 · Apr 27, 2026 · Updated last month
- Getting interpretable dimensions in word embedding spaces. ☆15 · Jul 6, 2023 · Updated 2 years ago
- ☆29 · Jun 23, 2022 · Updated 3 years ago
- ☆10 · Oct 22, 2024 · Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 · Mar 12, 2024 · Updated 2 years ago
- Python driver for MobilityDB ☆11 · May 1, 2026 · Updated last month
- Legal Matter Standard Specification (LMSS) library for Python ☆17 · Nov 14, 2023 · Updated 2 years ago