cdimascio / essence
Automatically extract the main text content (and more) from an HTML document
☆118 Updated 3 years ago
Alternatives and similar repositories for essence
Users who are interested in essence are comparing it to the libraries listed below.
- Crux offers a flexible plugin-based API & implementation to extract interesting information from Web pages. ☆243 Updated 10 months ago
- A Kotlin port of Mozilla's Readability. It extracts a website's relevant content and removes all clutter from it. ☆168 Updated 4 years ago
- Kotlin/Java library and CLI tool for scraping posts and media from various sources with neither authorization nor full page rendering (Fa… ☆321 Updated last week
- Life and collaboration assistant. ☆40 Updated this week
- A natural language date-time parser that extracts dates and times from text with context and parses them to the required format ☆245 Updated 2 months ago
- A set of reusable Java components that implement functionality common to any web crawler ☆251 Updated last week
- A Java annotation library for web scraping. ☆28 Updated 7 months ago
- Google Search Results Java API via SerpApi ☆46 Updated 7 months ago
- Java library to extract links (URLs, email addresses) from plain text; fast, small and smart ☆214 Updated 7 months ago
- StaticLog - super lightweight static logging for Kotlin, Java and Android ☆29 Updated 8 years ago
- A Java library to determine probability of objects being similar. ☆257 Updated last month
- Article extraction benchmark: dataset and evaluation scripts ☆351 Updated 4 months ago
- SimpleDNN is a lightweight open-source machine learning library written in Kotlin, designed to support relevant neural network architectur… ☆101 Updated 5 years ago
- Java client for txtai ☆40 Updated last week
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 7 years ago
- A natural language event parser for Java and Android. ☆103 Updated 5 years ago
- The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike ☆790 Updated 10 months ago
- extJWNL (Extended Java WordNet Library) is a Java API for creating, reading and updating dictionaries in WordNet format. ☆131 Updated last year
- A Java library for the Giphy API. ☆29 Updated 8 years ago
- A simple Java library for reading RSS and Atom feeds ☆190 Updated this week
- A dataset of multinational first names and last names ☆27 Updated 2 years ago
- Fuzzy Regular Expressions for Java ☆26 Updated 10 years ago
- Bindings to Google's Compact Language Detector 3 for JVM-based languages ☆21 Updated last year
- Multiplatform Kotlin Hello World (Android/iOS/Java/JavaScript/Native) ☆79 Updated 2 months ago
- Plugin for IntelliJ IDEs to track and record user activity ☆74 Updated last year
- An implementation of Go-Links, written in Kotlin ☆40 Updated 10 months ago
- Java port of SymSpell: 1 million times faster through Symmetric Delete spelling correction algorithm ☆67 Updated 6 months ago
- Simple OAuth 2.0 client written in Kotlin ☆25 Updated 8 years ago
- A Directory of Online Newspaper Sources for 70+ Languages ☆31 Updated 4 years ago
- News crawling with StormCrawler - stores content as WARC ☆363 Updated 11 months ago