gdamdam / sumo
Tool that extracts the text from web article URLs and provides word frequencies, entity recognition, automatic summaries, and more
☆20Updated 6 years ago
Alternatives and similar repositories for sumo
Users that are interested in sumo are comparing it to the libraries listed below
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- A more advanced utility to generate a large collection of unique images along with metadata.☆5Updated 2 years ago
- Example of how to pre-process news articles with textbox and index them on Elastic Search☆13Updated 7 years ago
- Traptor -- A distributed Twitter feed☆26Updated 2 years ago
- It finds best synonyms from Google Books when you press a hotkey☆30Updated 10 years ago
- The Requests Stampede library is a wrapper around the Requests library that provides request retry logic and backoff delays.☆10Updated 4 years ago
- A CLI tool for managing OpenAI batch processing jobs with ease.☆36Updated last month
- Functional composable pipelines allowing clean separation of the business logic and its implementation☆11Updated last year
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas…☆12Updated 11 years ago
- Pixel-mangling scripts for the command line.☆30Updated 5 months ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen…☆49Updated 8 years ago
- Maps clauses from a text corpus onto the metrical structure of a poem☆17Updated 9 years ago
- Scraping Amazon reviews using headless chrome and selenium☆10Updated 6 years ago
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)☆25Updated 7 years ago
- A workflow system for Natural Language Processing.☆21Updated 5 years ago
- Pipeline for distributed Natural Language Processing, made in Python☆65Updated 8 years ago
- RESTful API around the PETRARCH coding software☆10Updated 4 years ago
- Turn a doc into plaintext which you can listen to using TTS☆19Updated 2 years ago
- ☆15Updated 8 years ago
- Feet is a tool for extracting entities from a text according to dictionaries.☆11Updated 8 years ago
- A collection of prompts for use with the LLM CLI tool☆16Updated last year
- A [personal]<-[notebook]->[network]. Complete with custom numerics for constrained Gaussian gravitation physics.☆22Updated 3 years ago
- Mad (╯°□°)╯'ing☆10Updated 2 years ago
- search, dedupe, and media ingestion for mediachain☆33Updated 8 years ago
- Custom Python functions for working with SQLite FTS4☆22Updated 2 years ago
- Repository to allow collaboration between Cycle Labs Cloud community in support of the community.☆9Updated 3 years ago
- Generative tree visualiser for Python☆16Updated 4 years ago
- The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…☆24Updated 4 years ago
- Construct your personal API☆18Updated 2 years ago
- Visegrad+ Parliament API. Access to parliament data of Visegrad+ countries in a common data standard.☆11Updated 9 years ago