neuml / magnitude
Magnitude fork that only supports Word2Vec, GloVe and fastText embeddings
☆13 Updated 4 years ago
Alternatives and similar repositories for magnitude
Users who are interested in magnitude are comparing it to the libraries listed below.
- 🌸 Train floret vectors ☆18 Updated 2 years ago
- ☆30 Updated 3 years ago
- ☆42 Updated 2 years ago
- 🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset ☆13 Updated 3 years ago
- spaCy entry points for Curated Transformers ☆31 Updated last month
- A utility for labeling clusters of text data. ☆28 Updated 3 years ago
- Benchmark for Japanese document embedding & vector search ☆29 Updated last year
- ☆70 Updated 2 years ago
- 🧬 A VS Code extension for annotating data with Prodigy ☆30 Updated 3 years ago
- A Python implementation of the SimString, a simple and efficient algorithm for approximate string matching. ☆124 Updated last year
- FAST is an annotation tool that focuses on mobile devices. https://aclanthology.org/2021.emnlp-demo.41/ ☆53 Updated 3 years ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆35 Updated last year
- Efficient BM25 with DuckDB 🦆 ☆49 Updated 6 months ago
- spaCy match and replace, maintaining conjugation ☆35 Updated 2 years ago
- 💫 A spaCy package for Yohei Tamura's Rust tokenizations library ☆29 Updated 3 weeks ago
- 🔎 A Prodigy plugin for evaluating spaCy pipelines ☆13 Updated last year
- Generate reports for spaCy models. ☆29 Updated 3 years ago
- Sentence Embedding as a Service ☆15 Updated last year
- A file utility for accessing both local and remote files through a unified interface. ☆42 Updated last month
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ☆313 Updated 2 months ago
- History of Open-Source IR Systems ☆11 Updated 5 months ago
- 🍏 Make Thinc faster on macOS by calling into Apple's native Accelerate library ☆96 Updated 8 months ago
- Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022) ☆101 Updated last month
- Production-grade embedding generation, for any length of text, for transformer models. ☆23 Updated 2 weeks ago
- ☆42 Updated last year
- DEPRECATED--all functionality moved to nbdev ☆15 Updated 2 years ago
- Vespa application making an index of the CORD-19 dataset. ☆39 Updated 5 months ago
- 🐎 Colt: Effortlessly configure and construct Python objects with colt, a lightweight library inspired by AllenNLP and Tango ☆24 Updated 2 weeks ago
- Query Segmentation for search ☆20 Updated 5 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago