Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends
☆57Jan 28, 2024Updated 2 years ago
Alternatives and similar repositories for KeywordAnalysis
Users that are interested in KeywordAnalysis are comparing it to the libraries listed below.
- Keyword Extraction system using Brown Clustering - (This version is trained to extract keywords from job listings)☆18Sep 16, 2014Updated 11 years ago
- API - extract a list of keywords from a text.☆18Jul 6, 2017Updated 8 years ago
- Source real estate prices from the Common Crawl.☆27Oct 22, 2018Updated 7 years ago
- LinkRun - Data Engineering project done in 3 weeks during the Insight fellowship☆38Apr 2, 2020Updated 6 years ago
- An attempt to use financial news to predict the stock market☆16Nov 17, 2018Updated 7 years ago
- Extraction code used to create the Dresden Web Table Corpus☆14Feb 25, 2015Updated 11 years ago
- Python script to split PDF files into separate files based on bookmarks☆15Jan 21, 2022Updated 4 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri☆46Jan 16, 2022Updated 4 years ago
- A tiny Python clone of https://archive.org/web/ for your own personal websites.☆15Sep 30, 2020Updated 5 years ago
- ☆15Aug 15, 2012Updated 13 years ago
- Detect the text orientation on a page with Tesseract OCR☆14Dec 18, 2020Updated 5 years ago
- Here are all of the PowerPoint presentations that I have ever created and presented.☆12Dec 28, 2020Updated 5 years ago
- Community driven landing page generator for open source projects☆15Jan 25, 2016Updated 10 years ago
- Gathers URLs from Common Crawl☆35Nov 9, 2019Updated 6 years ago
- Use scrapy with a list of proxies generated from proxynova.com☆39Jan 3, 2013Updated 13 years ago
- Tools to construct and process Common Crawl webgraphs☆108May 25, 2026Updated 2 weeks ago
- Multi-Entity Extraction Framework for Academic Documents (with default extraction tools)☆31Oct 3, 2023Updated 2 years ago
- Create an Anime database containing all the Anime currently available on the website, which includes: 'Anime Title', 'Description', 'C…☆11Jun 10, 2020Updated 6 years ago
- Code for the paper: Combining Graph Degeneracy and Submodularity for Unsupervised Extractive Summarization☆17Apr 24, 2020Updated 6 years ago
- This Python code scrapes Google search results then applies sentiment analysis, generates text summaries, and ranks keywords.☆28Feb 14, 2021Updated 5 years ago
- ☆14Sep 22, 2016Updated 9 years ago
- Toolbox for IBP Coupled SPCM-CRP Hidden Markov Model. Also contains code for EM-based HMM learning and inference for Bayesian non-paramet…☆14Mar 21, 2019Updated 7 years ago
- An entity linking prototype, developed using the datasets from the TAC-KBP sub-task.☆27Apr 5, 2017Updated 9 years ago
- It is a Chrome extension, an alternative to ChatGPT. It is free and no data leaves your computer. Powered by WebLLM.☆16Mar 4, 2024Updated 2 years ago
- Process Common Crawl data with Python and Spark☆455Mar 26, 2026Updated 2 months ago
- A static file containing a list of popular RSS feeds.☆13Aug 25, 2016Updated 9 years ago
- A manual named-entity/part-of-speech tagger for spaCy. You can use it to create your own training datasets.☆12Jun 16, 2018Updated 7 years ago
- Smart Glasses for Police Force, a wearable augmented reality glasses with applications in security, medical and industrial field applicat…☆21Mar 20, 2018Updated 8 years ago
- This file maps a given list of company names to their proper websites and also maps a given list of websites to their company names.☆15Nov 16, 2018Updated 7 years ago
- Text analysis for automatic bookmarking/keyword extraction☆18Nov 20, 2016Updated 9 years ago
- subdomain list based on Common Crawl data, sorted by popularity☆18Nov 19, 2019Updated 6 years ago
- EmbedRank implemented in Python.☆15Jun 17, 2024Updated last year
- European Parliament website Python scraper☆12Oct 19, 2016Updated 9 years ago
- Convert powerpoint (pptx) files into raw text org or LaTeX files☆15Aug 28, 2018Updated 7 years ago
- 🐱💻 A keystroke-logging application for Windows, also capable of capturing mouse window clicks and sending event logs to a remote server☆14Jul 2, 2021Updated 4 years ago
- Extract images from PowerPoint files☆17Dec 1, 2011Updated 14 years ago
- Scripts for building a geo-located web corpus using Common Crawl data☆11Jan 18, 2026Updated 4 months ago
- Automated generation of powerpoint slides for fun and profit☆13Oct 18, 2017Updated 8 years ago
- seq2seq based keyphrase generation model sets, including copyrnn copycnn and copytransfomer☆50Feb 7, 2022Updated 4 years ago