zafercavdar / fasttext-langdetect
80x faster and 95% accurate language identification with Fasttext
☆164 · Updated last year
Alternatives and similar repositories for fasttext-langdetect
Users who are interested in fasttext-langdetect are comparing it to the libraries listed below.
- Simply, faster, sentence-transformers ☆143 · Updated last year
- A Python Search Engine for Humans 🥸 ☆243 · Updated last month
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆283 · Updated 4 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆212 · Updated 4 months ago
- Efficient few-shot learning with cross-encoders. ☆61 · Updated last year
- Python API for https://vespa.ai, the open big data serving engine ☆154 · Updated this week
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆186 · Updated 2 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆155 · Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆256 · Updated 3 years ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆77 · Updated 3 months ago
- multimodal document analysis ☆166 · Updated 2 months ago
- Generalist and Lightweight Model for Text Classification ☆167 · Updated last week
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆182 · Updated 7 months ago
- ☆372 · Updated 2 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆81 · Updated 2 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆111 · Updated last year
- Work with static vector models ☆36 · Updated 9 months ago
- The pipeline for the OSCAR corpus ☆175 · Updated 2 months ago
- A multilingual version of MS MARCO passage ranking dataset ☆146 · Updated 2 years ago
- Faster, modernized fork of the language identification tool langid.py ☆60 · Updated last year
- Completion After Prompt Probability. Make your LLM make a choice ☆82 · Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 · Updated last year
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆133 · Updated last year
- Guideline following Large Language Model for Information Extraction ☆424 · Updated last year
- ☆119 · Updated last year
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆45 · Updated last year
- Zero and Few shot named entity & relationships recognition ☆399 · Updated 4 months ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆70 · Updated 2 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆69 · Updated 5 months ago
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. ☆197 · Updated last year