datasciencecampus / pygrams

Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence

☆66

Alternatives and similar repositories for pygrams

Users who are interested in pygrams compare it to the libraries listed below

bradhackinen / nama
Fast, flexible name matching for large datasets
☆71 Updated 2 months ago
crazyfrogspb / RedditScore
Package for performing Reddit-based text analysis
☆21 Updated 6 years ago
PatentsView / PatentsProcessor
☆17 Updated 7 years ago
dssg / pgdedupe
A simple command line interface to the datamade/dedupe library.
☆42 Updated 2 years ago
JasonKessler / agefromname
Predict age and gender from a first name
☆59 Updated 7 years ago
J535D165 / recordlinkage-annotator
A browser user interface for manual labeling of record pairs.
☆48 Updated 2 years ago
jsoma / fuzzy_pandas
Fuzzy matches and merging of datasets in pandas using csvmatch
☆76 Updated 5 years ago
harvard-lil / cap-examples
Examples for getting started using https://case.law
☆69 Updated 3 years ago
cverluise / PatCit
Making Patent Citations Uncool Again
☆112 Updated 2 years ago
openeventdata / Dictionaries
PETRARCH actor, agent and verb dictionaries
☆22 Updated 7 years ago
openeventdata / phoenix_pipeline
Turning news into events since 2014.
☆51 Updated 8 years ago
aeturrell / occupationcoder
Given a job title and job description, the algorithm assigns a standard occupational classification (SOC) code to the job.
☆74 Updated last year
yash1994 / dframcy
Dataframe Integration with spaCy.
☆103 Updated 4 years ago
jstray / deepform
Using ML to extract campaign finance data from messy forms for journalism
☆77 Updated 3 years ago
openeventdata / petrarch2
Another next-generation event coding platform.
☆77 Updated 6 years ago
cantabular / databaker
Command line tool to convert spreadsheets to databases, made for the UK's Office for National Statistics.
☆80 Updated last year
lukewhyte / textpack
Group thousands of similar spreadsheet or database text entries in seconds
☆157 Updated 2 years ago
uwdata / termite-data-server
Data Server for Topic Models
☆122 Updated 2 years ago
usc-isi-i2 / rltk
Record Linkage ToolKit (Find and link entities)
☆110 Updated 2 years ago
pmbaumgartner / streamlitopedia
Collection of code snippets and utilities for streamlit apps
☆22 Updated 5 years ago
emilyinamillion / supreme-court-topics-overtime
☆46 Updated 3 months ago
vaneseltine / nominally
A maximum-strength name parser for record linkage.
☆39 Updated 2 months ago
appeler / namesexdata
Data on international first names and sex of people with that name
☆12 Updated 6 years ago
jplusplus / statscraper
A base library for building web scrapers for statistical data, and a helper ontology for (primarily Swedish) statistical data.
☆14 Updated 8 months ago
pewresearch / pewanalytics
Text and statistics utilities from Pew Research Center
☆85 Updated 3 years ago
lmullen / genderdata
A data package for R containing historical datasets about gender
☆25 Updated 3 years ago
walkerdb / supreme_court_transcripts
☆74 Updated last week
openeventdata / PLOVER
Next generation event data ontology
☆76 Updated last year
datawizard1337 / ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of diff…
☆89 Updated 3 years ago
verginer / disamby
Python package aiding in entity disambiguation based on string and location matching
☆18 Updated 2 years ago