maxdotio / mighty-batch
Highly concurrent and fast content processing for Mighty Inference Server
☆10 · Updated 2 years ago
Alternatives and similar repositories for mighty-batch:
Users who are interested in mighty-batch are comparing it to the libraries listed below.
- Neural Solr = Solr 9 + Mighty Inference + Node ☆16 · Updated 2 years ago
- Binary vector search example using Unum's USearch engine and pre-computed Wikipedia embeddings from Co:here and MixedBread ☆18 · Updated 10 months ago
- Documentation effort for the BookCorpus dataset ☆33 · Updated 3 years ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 3 years ago
- utilities for loading and running text embeddings with onnx ☆44 · Updated 6 months ago
- LLM plugin for clustering embeddings ☆69 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 · Updated 2 years ago
- Vespa application making an index of the CORD-19 dataset. ☆39 · Updated last month
- Efficiently computing & storing token n-grams from large corpora ☆18 · Updated 4 months ago
- hnsw implemented by python ☆19 · Updated 5 years ago
- spaCy entry points for Curated Transformers ☆27 · Updated 5 months ago
- Rust bindings for CTranslate2 ☆14 · Updated last year
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 · Updated 11 months ago
- A CLI tool for managing OpenAI batch processing jobs with ease. ☆33 · Updated 6 months ago
- Library for fast text representation and classification. ☆28 · Updated last year
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆19 · Updated 2 years ago
- Efficient BM25 with DuckDB 🦆 ☆39 · Updated 2 months ago
- numpy ufuncs for vector similarity ☆14 · Updated last year
- Code for SaGe subword tokenizer (EACL 2023) ☆24 · Updated 3 months ago
- Testing various image matching algorithms' performance on the Pinecone vector DB ☆43 · Updated last year
- Finds linguistic patterns effortlessly ☆35 · Updated last year
- This is the repo for the container that holds the models for the text2vec-transformers module ☆49 · Updated last month
- Framework for Self-Organizing Python Agents ☆29 · Updated last year
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆15 · Updated 11 months ago
- Official details for: [1803.08493] Context is Everything: Finding Meaning Statistically in Semantic Spaces ☆39 · Updated 5 years ago