bsolomon1124 / pycld3
Python3 bindings for the Compact Language Detector v3 (CLD3)
★155 · Jun 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for pycld3
Users who are interested in pycld3 are comparing it to the libraries listed below.
- ★178 · Mar 28, 2025 · Updated 10 months ago
- GlotWeb: Web Indexing for Low-Resource Languages -- under construction. · ★17 · Aug 13, 2025 · Updated 6 months ago
- Port of Google's language-detection library to Python. · ★1,871 · Mar 3, 2025 · Updated 11 months ago
- ★18 · Jan 26, 2023 · Updated 3 years ago
- Targeted language identifier, based on FastText and Hunspell. · ★38 · Sep 4, 2025 · Updated 5 months ago
- Language detection using Spacy and Fasttext · ★57 · Dec 17, 2023 · Updated 2 years ago
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text · ★1,630 · Nov 21, 2025 · Updated 2 months ago
- Residual Quantization Autoencoder, used for interpreting LLMs · ★14 · Jan 1, 2025 · Updated last year
- 80x faster and 95% accurate language identification with Fasttext · ★164 · Jan 23, 2024 · Updated 2 years ago
- Accurately find/replace/remove emojis in text strings · ★163 · Dec 16, 2023 · Updated 2 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata · ★96 · Feb 5, 2026 · Updated last week
- An example of graph embeddings for wikipedia page recommendations · ★11 · Aug 26, 2021 · Updated 4 years ago
- Stand-alone language identification system · ★2,452 · Jan 1, 2020 · Updated 6 years ago
- ★11 · Dec 25, 2020 · Updated 5 years ago
- ★10 · Jun 9, 2022 · Updated 3 years ago
- Code for COLING 2020 paper "Improving Document-level Sentiment Analysis with User and Product Context" · ★11 · Apr 13, 2022 · Updated 3 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. · ★35 · Feb 5, 2026 · Updated last week
- Convert CSV files into Markdown formatted tables · ★17 · Apr 20, 2021 · Updated 4 years ago
- Python library for converting HTML to markup or plain text · ★16 · Aug 30, 2025 · Updated 5 months ago
- Supporting UN Sustainable Development Goal (SDG) #2: Zero Hunger. · ★23 · May 20, 2022 · Updated 3 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … · ★106 · Feb 26, 2024 · Updated last year
- Make peer-2-peer global works · ★46 · Jan 29, 2026 · Updated 2 weeks ago
- A Python library for calculating a large variety of metrics from text · ★359 · Jan 30, 2026 · Updated 2 weeks ago
- A Knowledge Base for research software relying on large-scale text mining and curated knowledge sources · ★16 · May 14, 2023 · Updated 2 years ago
- Resource and Tool for Writing System Identification -- LREC 2024 · ★21 · Dec 29, 2025 · Updated last month
- speaker-disentangled speech linguistic content quantizer · ★24 · Mar 19, 2025 · Updated 10 months ago
- A Python utility for indexing file lines. Best demo honourable mention at ECIR 2024. · ★23 · Nov 9, 2025 · Updated 3 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency · ★185 · Jun 6, 2025 · Updated 8 months ago
- Multilingual text (NLP) processing toolkit · ★2,361 · Nov 10, 2023 · Updated 2 years ago
- EMNLP2022 "Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment" · ★19 · Feb 19, 2023 · Updated 2 years ago
- Data Collection System For NLP/Speech Recognition · ★25 · Apr 20, 2021 · Updated 4 years ago
- semantically distinct key phrase extraction using hilbert hashes. · ★51 · Feb 28, 2022 · Updated 3 years ago
- A fully customisable language detection pipeline for spaCy · ★93 · May 2, 2019 · Updated 6 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… · ★52 · Jun 12, 2020 · Updated 5 years ago
- Language-Agnostic SEntence Representations · ★3,658 · May 2, 2024 · Updated last year
- ★30 · Jun 23, 2022 · Updated 3 years ago
- Python port of Moses tokenizer, truecaser and normalizer · ★495 · Feb 6, 2026 · Updated last week
- Machine Learning models using a Bayesian approach and often PyMC3 · ★25 · Jan 21, 2021 · Updated 5 years ago
- This tool allows local LLM usage that can automate tasks without human intervention. The agent can call itself recursively and work on… · ★20 · May 5, 2025 · Updated 9 months ago