neuml / magnitude
Magnitude fork that only supports Word2Vec, GloVe and fastText embeddings
☆13 · Updated 5 years ago
Alternatives and similar repositories for magnitude
Users that are interested in magnitude are comparing it to the libraries listed below
- A Python implementation of SimString, a simple and efficient algorithm for approximate string matching. ☆124 · Updated 2 years ago
- Train floret vectors ☆18 · Updated 2 years ago
- ☆30 · Updated 3 years ago
- fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ☆326 · Updated 6 months ago
- Benchmark for Japanese document embedding & vector search ☆29 · Updated last year
- AllenNLP plugin for adding subcommands to use Optuna, making hyperparameter optimization easy ☆32 · Updated 3 years ago
- Language detection using spaCy and fastText ☆57 · Updated last year
- Make Thinc faster on macOS by calling into Apple's native Accelerate library ☆101 · Updated 4 months ago
- Camphr - NLP library for creating pipeline components ☆338 · Updated 2 years ago
- A spaCy package for Yohei Tamura's Rust tokenizations library ☆33 · Updated 5 months ago
- A Rust implementation of a RoBERTa classification model for the SNLI dataset ☆13 · Updated 4 years ago
- Additional lookup tables and data resources for spaCy ☆112 · Updated 5 months ago
- ☆70 · Updated 2 years ago
- Sentence transformers models for spaCy ☆109 · Updated 2 years ago
- ☆44 · Updated 2 years ago
- Use custom tokenizers in spacy-transformers ☆16 · Updated 3 years ago
- Cutting-edge experimental spaCy components and features ☆103 · Updated last year
- Source code for the paper "Web2Text: Deep Structured Boilerplate Removal", full paper @ ECIR'18 ☆170 · Updated 4 years ago
- Use ML-Annotate to label data for machine learning purposes ☆110 · Updated 5 years ago
- Yet another sentence-level tokenizer for Japanese text ☆22 · Updated 3 years ago
- Work with static vector models ☆34 · Updated 6 months ago
- Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- ☆69 · Updated 3 years ago
- Parallel and distributed training with spaCy and Ray ☆56 · Updated 2 years ago
- Code for "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" (NAACL 2022) ☆108 · Updated 6 months ago
- You can create datasets from Wikia/Wikipedia that can be used for entity recognition and entity linking. Dumps for ja-wiki and VTuber-wik… ☆17 · Updated 4 years ago
- A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently… ☆108 · Updated last year
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆128 · Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆168 · Updated 3 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP ☆78 · Updated last year