anvaka / ghindex
Creates a GitHub index for discovering similar repositories
☆192 Updated 8 years ago
Related projects:
- Session Buddy Chrome Extension tool ☆51 Updated 9 years ago
- Vecino is a command line application to discover Git repositories which are similar to the one that the user provides. ☆46 Updated 5 years ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen… ☆48 Updated 7 years ago
- Chrome extension: Gives Ctrl+F like find results which include non-exact (fuzzy) matches using string edit-distance and GloVe/Word2Vec. A… ☆136 Updated 4 years ago
- An interactive map of Stack Exchange tags for all sites. ☆125 Updated last year
- Scraping assistant tool. Editing and maintaining CSS/XPath selectors across webpages. ☆96 Updated 6 years ago
- Google Chrome Extension. Record All Browsing in Screenshots & Full Text. Search For Anything At Any Time. Never Forget Where You Read Som… ☆302 Updated 6 years ago
- Language Lego ☆142 Updated 4 years ago
- A Python canonicalizer to disambiguate and recognize known names from a poor quality data entry list. ☆20 Updated 8 years ago
- Unofficial Python wrapper for official Hacker News API ☆164 Updated 5 months ago
- Skinfer is a tool for inferring and merging JSON schemas ☆141 Updated 4 months ago
- Index URLs in Common Crawl ☆192 Updated 7 years ago
- Notetaking Electron app that can answer your questions and makes summaries for you ☆90 Updated last year
- A project to attempt to automatically login to a website given a single seed ☆122 Updated 2 years ago
- A Python library to detect and extract listing data from HTML pages. ☆109 Updated 7 years ago
- Tag Pocket articles based on the time required to read them. ☆44 Updated 3 years ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆116 Updated 3 months ago
- NER toolkit for HTML data ☆256 Updated 4 months ago
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆90 Updated 2 years ago
- Paginating the web ☆37 Updated 10 years ago
- Aviation grade news article metadata extraction ☆36 Updated last year
- It finds best synonyms from Google Books when you press a hotkey ☆30 Updated 9 years ago
- Demonstration of using Python to process the Common Crawl dataset with the mrjob framework ☆166 Updated 2 years ago
- Suite of tools for detecting changes in web pages and their rendering ☆53 Updated 9 months ago
- Advanced similarity and duplicate source code proof of concept for our research efforts. ☆52 Updated 2 years ago
- ☆77 Updated this week
- Automatic Web Article Summarizer ☆412 Updated 3 years ago
- A Python utility for moving bookmarks/reading lists between services ☆199 Updated 8 years ago
- Scrapy middleware that allows crawling only new content ☆79 Updated last year
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆183 Updated 2 years ago