fedelopez77 / langdetect
A language detection software
☆58 Updated 7 years ago
Alternatives and similar repositories for langdetect
Users that are interested in langdetect are comparing it to the libraries listed below
- ☆57 Updated last year
- 🔬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆162 Updated 4 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ☆61 Updated last year
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 Updated 4 years ago
- Efficient few-shot learning with cross-encoders. ☆59 Updated last year
- 80x faster and 95% accurate language identification with Fasttext ☆162 Updated last year
- multimodal document analysis ☆167 Updated last year
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆109 Updated last year
- Python API for https://vespa.ai, the open big data serving engine ☆144 Updated this week
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆187 Updated 3 months ago
- Easy modernBERT fine-tuning and multi-task learning ☆61 Updated 3 months ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an … ☆32 Updated 2 years ago
- Simply, faster, sentence-transformers ☆143 Updated last year
- Model implementation for the contextual embeddings project ☆36 Updated 4 months ago
- ☆52 Updated last year
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆98 Updated 10 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆153 Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆29 Updated 3 years ago
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen… ☆64 Updated 2 years ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 11 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆66 Updated last week
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated 2 years ago
- Generalist and Lightweight Model for Text Classification ☆163 Updated 3 months ago
- Tools for managing datasets for governance and training. ☆85 Updated 2 weeks ago
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023) ☆130 Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆22 Updated 3 months ago
- ☆79 Updated last year
- A Python Search Engine for Humans 🥸 ☆237 Updated last year
- [ACL 2023] Few-shot Reranking for Multi-hop QA via Language Model Prompting ☆27 Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆30 Updated 9 months ago