zafercavdar / fasttext-langdetect
80x faster and 95% accurate language identification with Fasttext
☆160 · Updated last year
Alternatives and similar repositories for fasttext-langdetect
Users who are interested in fasttext-langdetect are comparing it to the libraries listed below.
- Simply, faster, sentence-transformers ☆143 · Updated 11 months ago
- 🔬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆147 · Updated 2 months ago
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆223 · Updated 4 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆210 · Updated 3 months ago
- Efficient few-shot learning with cross-encoders. ☆56 · Updated last year
- A Python Search Engine for Humans 🥸 ☆226 · Updated last year
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆250 · Updated 2 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆154 · Updated 2 years ago
- (no description) ☆171 · Updated 4 months ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆78 · Updated this week
- Python API for https://vespa.ai, the open big data serving engine ☆133 · Updated this week
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆84 · Updated last year
- Incorporating VIsual LAyout Structures for Scientific Text Classification ☆179 · Updated 2 years ago
- The pipeline for the OSCAR corpus ☆171 · Updated last year
- Training open neural machine translation models ☆369 · Updated 4 months ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆168 · Updated 2 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆156 · Updated last year
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆80 · Updated last year
- (no description) ☆366 · Updated last year
- multimodal document analysis ☆165 · Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆108 · Updated last year
- A curated list of awesome data annotation tools ☆213 · Updated 2 years ago
- Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- Generalist and Lightweight Model for Text Classification ☆148 · Updated last month
- Targeted language identifier, based on FastText and Hunspell. ☆36 · Updated 5 months ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆133 · Updated 7 months ago
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ☆104 · Updated last year
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆186 · Updated 11 months ago
- A multilingual version of MS MARCO passage ranking dataset ☆144 · Updated last year
- Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible f… ☆221 · Updated 8 months ago