neuml / txtmarker
Highlight text in documents
★109 · Updated 2 months ago
Alternatives and similar repositories for txtmarker
Users who are interested in txtmarker are comparing it to the libraries listed below.
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K … ★83 · Updated 6 months ago
- Python package for deduplication/entity resolution using active learning ★81 · Updated 10 months ago
- Repository for deepdoctection tutorial notebooks ★45 · Updated 3 weeks ago
- Generalist and Lightweight Model for Text Classification ★139 · Updated last month
- GLiNER model in a FastAPI microservice. ★44 · Updated 7 months ago
- Python API for https://vespa.ai, the open big data serving engine ★127 · Updated last week
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ★59 · Updated last year
- Datasets and models for instruction-tuning ★238 · Updated last year
- Efficient few-shot learning with cross-encoders. ★54 · Updated last year
- A framework for converting natural language text inputs to corresponding Pandas, MongoDB, Kusto and Neo4j (Cypher) queries. ★83 · Updated last year
- Pre-train Static Word Embeddings ★84 · Updated last month
- Use Hugging Face text and token classification pipelines directly in spaCy ★63 · Updated last year
- Streamlit component for Jina neural search ★41 · Updated 3 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ★67 · Updated 2 months ago
- Reference-Free automatic summarization evaluation with potential hallucination detection ★100 · Updated last year
- Work with static vector models ★28 · Updated 2 months ago
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ★32 · Updated 3 months ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents. ★26 · Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ★108 · Updated last year
- OCR, Archive, Index and Search: Implementation agnostic OCR framework. ★222 · Updated last year
- Command Line Interface for Hugging Face Inference Endpoints ★66 · Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ★13 · Updated 10 months ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ★142 · Updated 6 months ago
- A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, PDF, with GRO… ★51 · Updated 3 months ago
- weasel: A small and easy workflow system ★85 · Updated last year
- NeatText a simple NLP package for cleaning textual data and text preprocessing ★72 · Updated last year
- Lite weight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by… ★31 · Updated 10 months ago
- Python library that allows you to get structured responses in the form of Pydantic models and Python types from Anthropic, Google Vertex … ★77 · Updated 11 months ago
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I… ★105 · Updated 2 weeks ago
- Aim-spaCy integration ★34 · Updated last year