cdimascio / essence
Automatically extract the main text content (and more) from an HTML document
☆118Updated 3 years ago
Alternatives and similar repositories for essence
Users that are interested in essence are comparing it to the libraries listed below
- Crux offers a flexible plugin-based API & implementation to extract interesting information from Web pages.☆243Updated 6 months ago
- A natural language date-time parser that extracts dates and times from text with context and parses them into the required format☆242Updated last year
- Kotlin/Java library and cli tool for scraping posts and media from various sources with neither authorization nor full page rendering (Fa…☆301Updated last week
- A Kotlin port of Mozilla's Readability. It extracts a website's relevant content and removes all clutter from it.☆164Updated 3 years ago
- Life and collaboration assistant.☆36Updated last week
- A set of reusable Java components that implement functionality common to any web crawler☆246Updated 3 weeks ago
- Java library to extract links (URLs, email addresses) from plain text; fast, small and smart☆212Updated 4 months ago
- The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike☆774Updated 6 months ago
- A fork of Dragnet that also extracts author, headline, date, and keywords from context, as well as built-in metadata extraction, all in one pac…☆294Updated 4 months ago
- A language detection Web Service☆53Updated 8 years ago
- StaticLog - super lightweight static logging for Kotlin, Java and Android☆29Updated 7 years ago
- Article extraction benchmark: dataset and evaluation scripts☆334Updated 3 weeks ago
- extJWNL (Extended Java WordNet Library) is a Java API for creating, reading and updating dictionaries in WordNet format.☆130Updated last year
- The LAW next generation crawler.☆88Updated 3 years ago
- A simple Java library for reading RSS and Atom feeds☆184Updated this week
- SimpleDNN is a machine learning lightweight open-source library written in Kotlin designed to support relevant neural network architectur…☆101Updated 5 years ago
- Index Common Crawl archives in tabular format☆122Updated 2 months ago
- An implementation of Go-Links, written in Kotlin☆39Updated 7 months ago
- A Kotlin/Java API for generating .ts source files.☆49Updated last year
- Java client for txtai☆38Updated last month
- An overview of the AI-as-a-service landscape☆159Updated 7 years ago
- Kotlin client for JetBrains Space HTTP API☆48Updated 9 months ago
- Multiplatform Kotlin Hello World (Android/iOS/Java/JavaScript/Native)☆78Updated last year
- Readability clone in Java☆460Updated 5 years ago
- Generate zod schemas from Kotlin data classes.☆17Updated 2 months ago
- A pure Java implementation of the Zeroconf technology.☆13Updated 2 years ago
- A human-friendly alternative to cron. Designed after GAE's schedule for Kotlin and/or Java 8+.☆83Updated 3 years ago
- News crawling with StormCrawler - stores content as WARC☆356Updated 7 months ago
- A web crawling framework written in Kotlin☆131Updated 4 years ago
- Simple embedded NLU for mobile apps☆70Updated 7 years ago