masakhane-io / masakhanePreprocessor
Building an effective preprocessing tool for African languages
☆13 Updated last year
Alternatives and similar repositories for masakhanePreprocessor
Users who are interested in masakhanePreprocessor are comparing it to the libraries listed below.
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text … ☆13 Updated last year
- MasakhaNEWS: News Topic Classification for African Languages ☆23 Updated last year
- Crosslingual Question Answering for African Languages ☆31 Updated 9 months ago
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages ☆74 Updated 3 years ago
- A simple library for segmenting legal texts ☆17 Updated 2 years ago
- Text simplification for a better world: Deep-Martin Transformer 🤗 ☆22 Updated last year
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H… ☆32 Updated last year
- MAFAND-MT ☆57 Updated last year
- List of all the resources I developed in collaboration with LSV and Masakhane during my doctoral studies and beyond ☆12 Updated 2 years ago
- ☆17 Updated 4 years ago
- Mining Legal Arguments in Court Decisions - Data and software ☆68 Updated 2 years ago
- ☆23 Updated 2 years ago
- Abstractive and Extractive Text summarization using Transformers. ☆84 Updated 2 years ago
- Collection of Datasets for Legal Text Processing ☆111 Updated this week
- Shoonya - Platform to Annotate and label data at scale. ☆56 Updated 10 months ago
- A dataset for pretraining language models targeted for legal tasks. ☆134 Updated 3 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆79 Updated last year
- Semantic Segmentation of Legal texts that labels sentences with one of 7 rhetorical roles. ☆73 Updated last year
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data. ☆16 Updated last year
- A (smart) rule based NLP module to extract job skills from text ☆187 Updated last year
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM" ☆59 Updated 8 months ago
- A personal knowledge base that I can dump information to and help me learn ☆24 Updated last month
- Learning PyTorch through the D2L book. A series of notebooks for the same ☆27 Updated 3 years ago
- POS for African languages ☆17 Updated 3 weeks ago
- ☆123 Updated last week
- ☆12 Updated 9 months ago
- ☆17 Updated 2 years ago
- Easy to use and understand multiple-choice question generation algorithm using T5 Transformers. ☆135 Updated 3 years ago
- Python interface for evaluation on ChatGPT models ☆19 Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year