gdamdam / sumo
Tool that extracts the text from web article URLs and provides word frequencies, entity recognition, automatic summaries, and more
☆20Updated 6 years ago
Alternatives and similar repositories for sumo:
Users that are interested in sumo are comparing it to the libraries listed below
- A POC for a tsdb storage using parquet☆66Updated this week
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- A more advanced utility to generate a large collection of unique images along with metadata.☆5Updated 2 years ago
- Example of how to pre-process news articles with textbox and index them in Elasticsearch☆13Updated 7 years ago
- Traptor -- A distributed Twitter feed☆26Updated 2 years ago
- Libraries and tools that allow you to programmatically create, modify, or convert images☆21Updated 4 years ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas…☆12Updated 11 years ago
- Access predefined IMAP mailboxes with a browser using one time passwords or a YubiKey☆17Updated 8 years ago
- email dataset for email signature parsing☆55Updated 8 years ago
- Render tweets into beautiful Markdown☆25Updated last week
- Maps clauses from a text corpus onto the metrical structure of a poem☆17Updated 9 years ago
- Run embedding models using ONNX☆32Updated last year
- ☆13Updated 6 years ago
- Markdown -> IPython conversion tool☆15Updated 10 years ago
- Paginating the web☆37Updated 11 years ago
- Repository to allow collaboration between Cycle Labs Cloud community in support of the community.☆9Updated 3 years ago
- Pipeline for distributed Natural Language Processing, made in Python☆64Updated 8 years ago
- API server for NLTK☆23Updated 8 years ago
- A Python canonicalizer to disambiguate and recognize known names from a poor quality data entry list.☆20Updated 9 years ago
- A raspberry pi 64bit image with spacy and neuralcoref pre-installed☆21Updated 5 years ago
- The Requests Stampede library is a wrapper around the Requests library that provides request retry logic and backoff delays.☆10Updated 4 years ago
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop.☆12Updated 4 years ago
- Simple and clean Python implementation of TextRank as per seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot…☆11Updated 4 years ago
- A collection of pre-built speech synthesis settings used to convey emotion☆11Updated 5 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…☆48Updated 3 years ago
- Python package for converting xml and epubs to text files☆34Updated 4 years ago
- Character Vomiting☆10Updated 7 years ago
- Functional composable pipelines allowing clean separation of the business logic and its implementation☆11Updated 11 months ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen…☆49Updated 8 years ago
- A CLI tool for managing OpenAI batch processing jobs with ease.☆35Updated last week