gdamdam / sumo
Tool that extracts the text from web article URLs and provides word frequencies, entity recognition, automatic summaries, and more
☆20Updated 6 years ago
Alternatives and similar repositories for sumo
Users that are interested in sumo are comparing it to the libraries listed below
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- TTS Client for Coqui TTS server☆13Updated 2 years ago
- A free dataset of (almost) all publicly available podcasts.☆134Updated 11 years ago
- Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your com…☆132Updated 6 months ago
- WordNet Domains, WordNet Affect and SentiWords☆47Updated 9 years ago
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc.☆18Updated 2 years ago
- Generate time-lapse video for a website☆22Updated 3 years ago
- LLM plugin for embeddings using sentence-transformers☆71Updated 4 months ago
- Render tweet into beautiful markdown☆26Updated last week
- Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python☆18Updated 2 years ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen…☆50Updated 8 years ago
- Coqui STT Model Manager - install, manage and try out Coqui STT models from the Model Zoo☆25Updated 2 years ago
- Experiment in automatic insertion of timed transcript corrections☆21Updated 7 years ago
- A platform for collecting, analyzing, and visualizing social media data.☆12Updated 4 years ago
- Searching for the occurrence seconds of words/phrases or arbitrary regex patterns within audio files☆102Updated 4 years ago
- Semanlink is a personal information management system based on RDF. It lets you add tags, as well as other RDF metadata, to files, bookma…☆18Updated 8 months ago
- Matrix-based News Aggregation to Explore Media Bias☆20Updated 7 years ago
- Domain-specific language for extracting structured data from HTML documents☆54Updated 4 months ago
- automate incrementally producing word pronunciation recordings for Wiktionary through Wikimedia Commons☆22Updated 7 years ago
- 🦁 Nala is an agile open-source voice assistant framework (20+ actions).☆35Updated 2 years ago
- Write Like Hemingway☆12Updated 10 years ago
- Get an answer to a question from multiple backend engines such as Google, WolframAlpha, or DuckDuckGo☆11Updated 4 years ago
- Experiments with Hugging Face 🔬 🤗☆44Updated last year
- Crawl sites for RSS, Atom, and JSON feeds.☆79Updated last month
- generate rules from lists of words☆16Updated 4 years ago
- Collector and speech cutter for librivox audiobooks☆24Updated 2 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.☆53Updated 4 years ago
- Reproducing "Writing with Transformer" demo, using aitextgen/FastAPI in backend, Quill/React in frontend☆28Updated 4 years ago
- Functional composable pipelines allowing clean separation of the business logic and its implementation☆11Updated 2 weeks ago
- Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tree implementation fo…☆104Updated 2 years ago