gdamdam / sumo
Tool that extracts the text from web article URLs and provides word frequencies, entity recognition, automatic summaries, and more
☆20Updated 6 years ago
Alternatives and similar repositories for sumo
Users that are interested in sumo are comparing it to the libraries listed below
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com…☆133Updated 7 months ago
- Get an answer to a question from multiple backend engines such as Google, WolframAlpha, or DuckDuckGo☆11Updated 4 years ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen…☆50Updated 8 years ago
- It finds best synonyms from Google Books when you press a hotkey☆30Updated 10 years ago
- A free dataset of (almost) all publicly available podcasts.☆134Updated 11 years ago
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc.☆18Updated 2 years ago
- A platform for collecting, analyzing, and visualizing social media data.☆12Updated 4 years ago
- Crawl Wikipedia pages and upload TTS to YouTube.☆10Updated 6 months ago
- LLM plugin for embeddings using sentence-transformers☆72Updated 5 months ago
- WordNet Domains, WordNet Affect and SentiWords☆48Updated 9 years ago
- Suite of tools for detecting changes in web pages and their rendering☆55Updated last year
- Import your genome into a SQLite database☆24Updated 6 years ago
- Crawl sites for RSS, Atom, and JSON feeds.☆81Updated this week
- Matrix-based News Aggregation to Explore Media Bias☆19Updated 7 years ago
- An interface for interacting with MediaWiki☆37Updated 3 years ago
- Manage, generate, and convert chapters for podcasts and other media via CLI and web☆37Updated 6 months ago
- Discussion Summarization is the process of condensing a text document which is a collection of discussion threads, using CBS (Cluster Bas…☆12Updated 11 years ago
- Pocketsphinx-based Linux Voice Dictation☆25Updated 5 years ago
- Maps clauses from a text corpus onto the metrical structure of a poem☆17Updated 10 years ago
- Quantified Self: A Personal Data Aggregator and Dashboard for Self-Trackers and Quantified Self Enthusiasts☆18Updated 2 years ago
- Cleaning tool for web scraped text☆38Updated 2 years ago
- Faster, modernized fork of the language identification tool langid.py☆59Updated 10 months ago
- LLM plugin for clustering embeddings☆82Updated last year
- Wikipedia Live Monitor☆22Updated 9 months ago
- Simple and clean Python implementation of TextRank as per seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot…☆11Updated 4 years ago
- ☆10Updated 7 months ago
- GPT2Explorer brings an OpenAI GPT2 language model playground that runs locally on standard Windows computers.☆28Updated 3 years ago
- Presentations on Quantified Self and Self-Tracking with Python☆31Updated 2 years ago
- An Alexa skill providing a conversational interface to any public figure (as mimicked by GPT3). The legacy GUI is no longer maintained.☆20Updated last year