opensearch-project / opensearch-py-ml
☆43 · Updated this week
Alternatives and similar repositories for opensearch-py-ml:
Users who are interested in opensearch-py-ml are comparing it to the libraries listed below.
- Entity resolution, also known as data matching or record linkage, is the task of finding records in a data set that refer to the same or similar real… ☆23 · Updated 5 months ago
- Search Request Processor: pipeline for transformation of queries and results inline with a search request. ☆23 · Updated last month
- Batteries included toolkit for data engineering. ☆33 · Updated 2 months ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆56 · Updated 3 months ago
- Provides an easy way to protect your data sources with Python by searching their metadata. 🛡️ ☆16 · Updated this week
- This is the repo for the container that holds the models for the text2vec-transformers module ☆49 · Updated last month
- Scraping and querying documents for LLMs ☆18 · Updated 2 months ago
- spaCy entry points for Curated Transformers ☆27 · Updated 5 months ago
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆44 · Updated 8 months ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 · Updated last year
- Graph Engine for Exploration and Search ☆40 · Updated last year
- Neural Solr = Solr 9 + Mighty Inference + Node ☆17 · Updated 2 years ago
- User Behavior Insights standard schema specification ☆26 · Updated 2 weeks ago
- Prefect integrations for working with OpenAI. ☆35 · Updated 10 months ago
- Fork of https://github.com/o19s/elasticsearch-learning-to-rank to work with OpenSearch ☆16 · Updated this week
- Examples of vector DB indexing and query with various vector databases. ☆12 · Updated last month
- 🔎 A Prodigy plugin for evaluating spaCy pipelines ☆13 · Updated 11 months ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆32 · Updated 2 years ago
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆21 · Updated 2 years ago
- Portable Python ML-powered data bot ☆23 · Updated 5 months ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 · Updated last year
- Magniv Core - A Python-decorator based job orchestration platform. Avoid responsibility handoffs by abstracting infra and DevOps. ☆78 · Updated 8 months ago
- Repo to experiment with Graph RAG strategies using Kùzu ☆49 · Updated 3 months ago
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search ☆23 · Updated last year
- LLM application tracing based on OpenTelemetry ☆10 · Updated last month
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems ☆10 · Updated last year
- Leverage your LangChain trace data for fine tuning ☆41 · Updated 7 months ago
- Aim-spaCy integration ☆34 · Updated last year
- Language detection using spaCy and fastText ☆55 · Updated last year