fedelopez77 / langdetect
A language detection software
☆58Updated 8 years ago
Alternatives and similar repositories for langdetect
Users that are interested in langdetect are comparing it to the libraries listed below
- 80x faster and 95% accurate language identification with Fasttext☆161Updated last year
- Efficient few-shot learning with cross-encoders.☆59Updated last year
- ☆58Updated last year
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal …☆32Updated 4 years ago
- Universal text classifier for generative models☆25Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆110Updated last year
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆163Updated 4 months ago
- ☆62Updated last year
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆62Updated last year
- ☆50Updated 8 months ago
- Datasets collection and preprocessings framework for NLP extreme multitask learning☆188Updated 3 months ago
- ☆25Updated last year
- Python API for https://vespa.ai, the open big data serving engine☆146Updated this week
- BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages☆228Updated last year
- Seed Machine Translation Data☆33Updated 11 months ago
- Model implementation for the contextual embeddings project☆36Updated 5 months ago
- Simply, faster, sentence-transformers☆143Updated last year
- A Multilingual Replicable Instruction-Following Model☆95Updated 2 years ago
- ☆174Updated 7 months ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated 2 years ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆66Updated last month
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆99Updated 11 months ago
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023)☆132Updated last year
- CMU Linguistic Annotation Backend☆14Updated last month
- ☆86Updated 7 months ago
- PyLate efficient inference engine☆66Updated last month
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Updated last year
- A Python Search Engine for Humans 🥸☆237Updated last year
- LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development☆19Updated 2 years ago
- multimodal document analysis☆166Updated last year