philipperemy/name-dataset

The Python library for names.

☆1,010

Alternatives and similar repositories for name-dataset

Users who are interested in name-dataset are comparing it to the libraries listed below.

google-research-datasets / uninum
A database of number names for 186 languages, locales, and scripts
☆67 · Mar 3, 2023 · Updated 3 years ago
opensanctions / qarin
How can we improve name matching in screening tools?
☆17 · Aug 13, 2025 · Updated 11 months ago
mikex86 / DeepSpeech-Java-Bindings
Java Bindings for the C++ library DeepSpeech
☆10 · Jun 4, 2020 · Updated 6 years ago
vadimkantorov / inferspeech
PyTorch speech2text inference script for the NVidia openseq2seq wav2letter model variant
☆10 · Aug 12, 2019 · Updated 6 years ago
philipperemy / keras-snail-attention
SNAIL Attention Block for Keras.
☆17 · Mar 30, 2020 · Updated 6 years ago
Idlak / Living-Audio-Dataset
A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone …
☆43 · Aug 3, 2022 · Updated 3 years ago
openownership / register
A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia
☆19 · Oct 30, 2024 · Updated last year
steveash / jg2p
Grapheme to phoneme toolkit using joint-modelling + CRFs in java
☆15 · Jul 14, 2018 · Updated 8 years ago
carltonnorthern / nicknames
A CSV file with US given names (first name) and their associated nicknames or diminutive names.
☆324 · Jun 1, 2026 · Updated last month
datamade / probablepeople
a python library for parsing unstructured western names into name components.
☆622 · May 15, 2025 · Updated last year
revdotcom / words2num
Convert words to numbers
☆21 · Apr 13, 2022 · Updated 4 years ago
GRAAL-Research / deepparse
Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning
☆349 · Jul 2, 2026 · Updated 3 weeks ago
maxmelnick / spark-graph-er
☆15 · May 19, 2019 · Updated 7 years ago
opensanctions / datapatch
A Python library for defining rule-based overrides on messy data
☆18 · Nov 24, 2025 · Updated 8 months ago
keredson / wordninja
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.
☆874 · Feb 19, 2023 · Updated 3 years ago
idiap / inv-tn
A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)
☆21 · Sep 27, 2017 · Updated 8 years ago
EMRAI / emrai-synthetic-diarization-corpus
☆22 · Sep 24, 2018 · Updated 7 years ago
flairNLP / flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
☆14,382 · Oct 27, 2025 · Updated 8 months ago
JarrodAJ / sec_employee_information_extraction
NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values …
☆15 · Aug 20, 2018 · Updated 7 years ago
mammothb / symspellpy
Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm…
☆877 · Updated this week
pudo / prefixdate
Provide partial dates and retain the date precision through processing
☆14 · Aug 4, 2025 · Updated 11 months ago
desh2608 / kaldi-noise-vectors
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13 · Feb 13, 2021 · Updated 5 years ago
namsor / namsor-python-tools-v2
NamSor Python command line tools, to append gender, origin, diaspora or us 'race'/ethnicity to a CSV file.
☆15 · Sep 6, 2023 · Updated 2 years ago
bradhackinen / nama
Fast, flexible name matching for large datasets
☆71 · Aug 29, 2025 · Updated 10 months ago
charlesliucn / LanMIT
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
☆22 · Jul 12, 2019 · Updated 7 years ago
neuspell / neuspell
NeuSpell: A Neural Spelling Correction Toolkit
☆713 · Jul 31, 2023 · Updated 2 years ago
AppleHolic / audioset_augmentor
Sound augmentation using Large-scale audio dataset (Audioset)
☆45 · Jun 29, 2021 · Updated 5 years ago
nreimers / truecaser
Language independent truecaser in Python.
☆160 · Oct 17, 2021 · Updated 4 years ago
drgriffis / PyUMLS
Python wrapper for UMLS REST API
☆10 · Dec 17, 2018 · Updated 7 years ago
mayhewsw / pytorch-truecaser
A simple neural truecaser written in pytorch and allennlp.
☆35 · Jun 17, 2024 · Updated 2 years ago
dedupeio / dedupe
A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
☆4,487 · Jul 29, 2025 · Updated 11 months ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for GramVanni hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
snakers4 / gpu-box-setup
☆21 · Jul 28, 2020 · Updated 5 years ago
slackapi / workshop-pycon-2019
PyCon Slack workshop
☆12 · Jun 7, 2019 · Updated 7 years ago
juand-r / entity-recognition-datasets
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of lang…
☆1,574 · Jul 2, 2026 · Updated 3 weeks ago
artbataev / end2end
Losses and decoders for end-to-end ASR and OCR
☆34 · Oct 30, 2020 · Updated 5 years ago
openspending / community.openspending.org
OpenSpending Community Site
☆16 · Apr 14, 2023 · Updated 3 years ago
moj-analytical-services / splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
☆2,284 · Updated this week
steveash / NETransliteration-COLING2018
Code and data used in named entity transliteration experiments
☆56 · Jun 4, 2018 · Updated 8 years ago