fhoffa / analyzing_github
Analyzing GitHub with BigQuery and other tools
☆195Updated 5 years ago
Alternatives and similar repositories for analyzing_github
Users who are interested in analyzing_github are comparing it to the libraries listed below
- Predict code bug risk with git metadata☆42Updated 5 years ago
- The code processes URLs in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private…☆23Updated 3 years ago
- The GHtorrent project website☆157Updated last year
- Scripts to mirror Github in a cloudy fashion☆567Updated last year
- Send Sir Perceval on a quest to retrieve and gather data from software repositories.☆303Updated last week
- A command line tool to cluster html pages based on structural and style similarity.☆20Updated last month
- A scraper focused on organizational Github accounts and their members.☆42Updated 3 years ago
- ☆78Updated 2 years ago
- CLK hash: hash pii for entity matching☆47Updated 3 months ago
- An analysis of all 1.3 million public Jupyter Notebooks on Github in July 2017☆72Updated 7 years ago
- Crawl GitHub APIs and store the discovered orgs, repos, commits, ...☆388Updated 4 years ago
- Scraping Assisted by Learning☆35Updated 3 weeks ago
- Experiments to help discussion on Wikipedia talk pages☆66Updated last month
- 💨🥫 A Data Factory system for running data processing pipelines built on AirFlow and tailored to CKAN. Includes evolution of DataPusher …☆32Updated last month
- Clean personally identifiable information from dirty dirty text.☆414Updated 2 years ago
- A simple command line interface to the datamade/dedupe library.☆42Updated 2 years ago
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets…☆46Updated 4 years ago
- Data validation as a service. Project retired, go to the current one at frictionsless/repository☆69Updated 2 years ago
- Scraping Tweet data for Russian Troll Twitter accounts into Neo4j☆57Updated 7 years ago
- Inspect a URL and estimate if it contains a news story☆39Updated 9 months ago
- Ontology dataset for open_numbers namespace☆10Updated 9 months ago
- Open Data 500☆23Updated 7 years ago
- Calculate the score of a repository based on best engineering practices.☆111Updated 4 years ago
- Now included in rigour☆151Updated 3 weeks ago
- A helper library full of URL-related heuristics.☆70Updated 2 months ago
- Embeddable dataviz. Like emoji, but charts. Tiny adorable little charts.☆22Updated 5 years ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text.☆45Updated 6 years ago
- Stream Twitter Data into BigQuery with Cloud Dataprep☆22Updated 2 weeks ago
- Perspectives on Data Science for Software Engineering☆61Updated 2 years ago
- Ideas for (tech) stuff to research, build or work on.☆50Updated 7 months ago