fhoffa / analyzing_github
Analyzing GitHub with BigQuery and other tools
☆198Updated 5 years ago
Alternatives and similar repositories for analyzing_github
Users who are interested in analyzing_github are comparing it to the repositories listed below
- Advanced similarity and duplicate source code at scale.☆56Updated 6 years ago
- ☆31Updated 11 years ago
- The code processes URLs in an attempt to consolidate different web addresses that point to the same URL and to remove potentially private…☆23Updated 4 years ago
- A scraper focused on organizational GitHub accounts and their members.☆42Updated 2 weeks ago
- A tutorial on how to do GitHub research with GHTorrent http://ghtorrent.github.io/tutorial☆21Updated last year
- Train a model, and detect gibberish strings with it.☆67Updated 3 years ago
- Creates github index for similar repositories discovery☆192Updated 9 years ago
- Experiments to help discussion on Wikipedia talk pages☆68Updated 2 months ago
- Venmo transaction dataset for data analysis/visualization/anything☆212Updated 5 years ago
- Clean personally identifiable information from dirty dirty text.☆415Updated 2 years ago
- A simple dataset of Stack Overflow questions and tags☆108Updated 8 years ago
- An analysis of all 1.3 million public Jupyter Notebooks on GitHub in July 2017☆72Updated 7 years ago
- The AI Incident Database seeks to identify, define, and catalog artificial intelligence incidents.☆213Updated last week
- A Singer tap for extracting data from the GitHub API☆74Updated 3 weeks ago
- ARCHIVED, replaced by https://github.com/pypa/linehaul-cloud-function/☆71Updated 4 years ago
- Crawl GitHub APIs and store the discovered orgs, repos, commits, ...☆388Updated 5 years ago
- Ontology dataset for open_numbers namespace☆10Updated 2 weeks ago
- Advanced similarity and duplicate source code proof of concept for our research efforts.☆52Updated 3 years ago
- The Data Linter identifies potential issues (lints) in your ML training data.☆89Updated 7 years ago
- A maximum-strength name parser for record linkage.☆39Updated 2 months ago
- Common Crawl Index Server☆71Updated 8 months ago
- Scraping Assisted by Learning☆35Updated 2 months ago
- sync a website or local spreadsheet with a google sheet☆35Updated 2 years ago
- Datasette plugin for publishing data using Vercel☆45Updated 3 years ago
- This project is created to promote and advocate the use of FOSS machine learning.☆47Updated 6 months ago
- Assessing Source Code Semantic Similarity with Unsupervised Learning☆41Updated 7 years ago
- Tracking the history of the FARA data from https://www.justice.gov/nsd-fara☆14Updated 2 years ago
- Send Sir Perceval on a quest to retrieve and gather data from software repositories.☆308Updated last week
- Clicks-Attention-Satisfaction Evaluation Model and Metric☆77Updated 9 years ago
- scrape messages from slack channels☆31Updated 6 years ago