rai-project / go-fasttext
☆14 · Updated 7 years ago
Alternatives and similar repositories for go-fasttext
Users who are interested in go-fasttext are comparing it to the libraries listed below.
- Go Bindings for BERT NLP Models ☆105 · Updated 6 years ago
- Golang binding for facebook fastText ☆13 · Updated 8 years ago
- LASER multilingual sentence embeddings as a pip package ☆224 · Updated 2 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 · Updated 4 years ago
- Corpus preprocessing ☆98 · Updated last year
- A word2vec negative sampling implementation with correct CBOW update. ☆260 · Updated 3 years ago
- Source code for the Apple reproduction ☆32 · Updated 4 years ago
- Demonstration of the results in "Text Normalization using Memory Augmented Neural Networks", Authors: Subhojeet Pramanik, Aman Hussain ☆60 · Updated 6 years ago
- Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime. ☆127 · Updated 4 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆31 · Updated last week
- xfspell — the Transformer Spell Checker ☆190 · Updated 5 years ago
- go-corenlp is a Golang wrapper for Stanford CoreNLP. ☆30 · Updated 6 years ago
- Python package for lexicon; Trie and DAWG implementation. ☆55 · Updated 9 months ago
- Word Embeddings in Go! ☆502 · Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆253 · Updated 2 years ago
- COMBO is jointly trained tagger, lemmatizer and dependency parser. ☆35 · Updated 2 years ago
- A multilingual command line sentence tokenizer in Golang ☆456 · Updated last year
- Text tokenization and sentence segmentation (segtok v2) ☆206 · Updated 3 years ago
- Lightning Fast Language Prediction 🚀 ☆167 · Updated 3 weeks ago
- A python true casing utility that restores case information for texts ☆89 · Updated 2 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆170 · Updated 3 years ago
- ☆42 · Updated 7 years ago
- A tiny BERT for low-resource monolingual models ☆31 · Updated 11 months ago
- A library for efficient similarity search and clustering of dense vectors. It's a Go wrapper of faiss (https://github.com/facebookresearc… ☆24 · Updated 2 years ago
- code and data used to build a training dataset for dragnet models ☆10 · Updated 4 years ago
- ☆11 · Updated 4 years ago
- a contextual, biasable, word-or-sentence-or-paragraph extractive summarizer powered by the latest in text embeddings (Bert, Universal Sen… ☆230 · Updated 2 years ago
- terashuf shuffles multi-terabyte text files using limited memory ☆226 · Updated 2 years ago
- This repository contains source code to binarize any real-value word embeddings into binary vectors. ☆47 · Updated 4 years ago
- ☆173 · Updated 5 months ago