Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends
☆57 Jan 28, 2024 Updated 2 years ago
Alternatives and similar repositories for KeywordAnalysis
Users that are interested in KeywordAnalysis are comparing it to the libraries listed below.
- Keyword Extraction system using Brown Clustering - (This version is trained to extract keywords from job listings) ☆18 Sep 16, 2014 Updated 11 years ago
- API - extract a list of keywords from a text. ☆18 Jul 6, 2017 Updated 8 years ago
- LinkRun - Data Engineering project done in 3 weeks during the Insight fellowship ☆38 Apr 2, 2020 Updated 6 years ago
- An attempt to use financial news to predict stock market ☆16 Nov 17, 2018 Updated 7 years ago
- Simple multi threaded tool to extract domain related data from commoncrawl.org ☆31 Jul 17, 2018 Updated 7 years ago
- Corpus of domain names scraped from Common Crawl and manually annotated to add word boundaries (e.g. "commoncrawl" to "common crawl"). ☆20 Jun 16, 2025 Updated 10 months ago
- Generates the most important key-phrase/key-words from a document based on a corpus ☆10 Jun 17, 2024 Updated last year
- Experimental AGS data format tool in Python ☆12 Oct 17, 2018 Updated 7 years ago
- Extraction code used to create the Dresden Web Table Corpus ☆14 Feb 25, 2015 Updated 11 years ago
- Python script to split PDF files into separate files based on bookmarks ☆15 Jan 21, 2022 Updated 4 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 Jan 16, 2022 Updated 4 years ago
- ☆15 Aug 15, 2012 Updated 13 years ago
- ☆19 Dec 19, 2018 Updated 7 years ago
- Here are all of the PowerPoint presentations that I have ever created and presented. ☆12 Dec 28, 2020 Updated 5 years ago
- Community driven landing page generator for open source projects ☆15 Jan 25, 2016 Updated 10 years ago
- Tools to construct and process Common Crawl webgraphs ☆108 Updated this week
- ☆16 Jul 31, 2020 Updated 5 years ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools) ☆31 Oct 3, 2023 Updated 2 years ago
- Wikipedia-based keyword extraction tool in Java ☆22 May 11, 2015 Updated 10 years ago
- This Python code scrapes Google search results then applies sentiment analysis, generates text summaries, and ranks keywords. ☆29 Feb 14, 2021 Updated 5 years ago
- Language models are open knowledge graphs (unofficial implementation) ☆13 Jan 17, 2021 Updated 5 years ago
- ☆14 Sep 22, 2016 Updated 9 years ago
- An entity linking prototype, developed using the datasets from the TAC-KBP sub-task. ☆27 Apr 5, 2017 Updated 9 years ago
- Process Common Crawl data with Python and Spark ☆454 Mar 26, 2026 Updated last month
- Common crawl extractor ☆83 May 21, 2024 Updated last year
- A helpful package that helps you access shell & shell-based applications via web application ☆16 Jul 25, 2023 Updated 2 years ago
- Joyner Document Format 2.0 (JDF) LaTeX Template ☆14 Jun 2, 2019 Updated 6 years ago
- EmbedRank implemented in Python. ☆15 Jun 17, 2024 Updated last year
- European Parliament website Python scraper ☆12 Oct 19, 2016 Updated 9 years ago
- Code for the paper "Benchmarking sentiment analysis methods for large-scale texts: A case for using continuum-scored words and word shift… ☆16 Jun 8, 2017 Updated 8 years ago
- A Python library for checking, validating, and converting variable types at run time. ☆17 Updated this week
- Spark/Cassandra/Akka combo to visualize a cloud of words using d3.js ☆11 Dec 6, 2015 Updated 10 years ago
- seq2seq based keyphrase generation model sets, including copyrnn, copycnn, and copytransformer ☆50 Feb 7, 2022 Updated 4 years ago
- Extract data from an HTML table and store results to a csv file. ☆38 Oct 2, 2015 Updated 10 years ago
- Semantic Parser with Execution ☆13 Dec 8, 2017 Updated 8 years ago
- test ☆23 Nov 11, 2020 Updated 5 years ago
- create concept map from textbook data ☆11 May 4, 2018 Updated 8 years ago
- A Supervised Approach To The Interpretation Of Imperative To-Do Lists ☆12 Jun 29, 2018 Updated 7 years ago
- Statistics of Common Crawl monthly archives mined from URL index files ☆219 Apr 27, 2026 Updated last week