masakhane-io / masakhanePreprocessor
Building an effective preprocessing tool for African languages
☆13Updated last year
Alternatives and similar repositories for masakhanePreprocessor
Users who are interested in masakhanePreprocessor are comparing it to the libraries listed below.
- MasakhaNEWS: News Topic Classification for African Languages☆24Updated last year
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H…☆35Updated 2 months ago
- MAFAND-MT☆60Updated last year
- Crosslingual Question Answering for African Languages☆30Updated last year
- Streamlit app to Translate text to or between 50 languages with mBART-50 from Huggingface and Facebook☆25Updated 4 years ago
- Transforming textual descriptions into process models using deep learning☆15Updated 6 years ago
- List of all the resources I developed in collaboration with LSV and Masakhane during my doctoral studies and beyond☆12Updated 3 years ago
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆80Updated 3 years ago
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text …☆13Updated last year
- Text simplification for a better world: Deep-Martin Transformer 🤗☆22Updated 2 years ago
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints"☆27Updated 2 years ago
- Repository containing awesome resources regarding Hugging Face tooling.☆48Updated 2 years ago
- A collection of textual datasets in Hausa language and the corresponding translation in English language.☆16Updated 4 years ago
- A large scale Humor Dataset, containing more than 550k rated English jokes (LREC'20)☆71Updated 2 years ago
- Using short models to classify long texts☆21Updated 2 years ago
- Python interface for evaluation on ChatGPT models☆19Updated last year
- GPTNERMED is a language model-generated, synthetic dataset and an open neural NER model for medical entities designed for German data.☆16Updated 2 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated 2 years ago
- POS for African languages☆19Updated 6 months ago
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs).☆76Updated this week
- A curated list of materials on AI guardrails☆43Updated 7 months ago
- ☆23Updated 11 months ago
- A Streamlit app to extract keywords using KeyBert☆37Updated 4 years ago
- ☆12Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0☆17Updated 2 years ago
- 🤗 Push your spaCy pipelines to the Hugging Face Hub☆45Updated last year
- A Python package to get useful information from documents using TopicRank Algorithm.☆16Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆72Updated last year
- Transformers for Clinical NLP☆26Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated last year