diasks2 / pragmatic_segmenter
Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages.
☆567 · Updated 8 months ago
Alternatives and similar repositories for pragmatic_segmenter:
Users interested in pragmatic_segmenter are comparing it to the libraries listed below.
- Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy ☆108 · Updated 3 years ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆129 · Updated 2 years ago
- Ruby bindings to the Stanford Core NLP tools (English, French, German). ☆436 · Updated 5 years ago
- Calculate similarity between documents using TF-IDF weights ☆116 · Updated 4 months ago
- Find a needle (a document or record) in a haystack using string similarity and (optionally) regular expression rules. Uses Dice's Coeffic… ☆680 · Updated 3 years ago
- Lexical database of any language ☆179 · Updated 2 years ago
- A language detection library for Ruby that uses bloom filters for speed. ☆685 · Updated 2 years ago
- A general classifier module to allow Bayesian and other types of classifications. A fork of cardmagic/classifier. ☆557 · Updated 10 months ago
- Wikipedia information extraction library ☆175 · Updated last year
- Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf) ☆502 · Updated 2 years ago
- Retrofitting Word Vectors to Semantic Lexicons ☆375 · Updated 6 years ago
- A sentence segmenter that actually works! ☆305 · Updated 4 years ago
- Multilingual word vectors in 78 languages ☆1,196 · Updated 2 years ago
- A collection of links to Ruby Natural Language Processing (NLP) libraries, tools and software ☆1,280 · Updated 2 years ago
- A multilingual tokenizer to split a string into tokens ☆91 · Updated 8 months ago
- SemCor and Masc documents annotated with NOAD word senses. ☆183 · Updated 5 years ago
- Approximate String Matching library ☆380 · Updated 3 months ago
- English Part-of-Speech Tagger Library; a Ruby port of Lingua::Tagger ☆271 · Updated 3 months ago
- SymSpellCompound: compound aware automatic spelling correction ☆66 · Updated 7 years ago
- Official version of TextTeaser. ☆624 · Updated 6 years ago
- Natural language processing framework for Ruby. ☆1,369 · Updated 7 years ago
- ☆818 · Updated last year
- Ruby Binding for Stanford Pos-Tagger and Name Entity Recognizer ☆92 · Updated 10 years ago
- German Morphological Analyzer ☆47 · Updated 3 years ago
- Language independent truecaser in Python. ☆160 · Updated 3 years ago
- A pure Ruby interface to the WordNet database ☆90 · Updated 5 years ago
- Anafora is a web-based raw text annotation tool ☆240 · Updated 2 years ago
- Comprehensive data proxy to knowledge about real world ☆818 · Updated 2 years ago
- Various utilities for processing the data. ☆208 · Updated this week
- Language Detection with Infinity-gram ☆230 · Updated 9 years ago