AzBuki-ML / public-data
Custom-built Bulgarian language data sets, used by АзБуки.ML for sentiment analysis, text classification, summarisation and generation. Open-source & free to use in any ML project.
☆19Updated 2 years ago
Alternatives and similar repositories for public-data
Users who are interested in public-data are comparing it to the repositories listed below:
- A curated list of NLP resources for Hungarian☆268Updated 2 weeks ago
- CLASSLA Fork of the Official Stanford NLP Python Library for Many Human Languages☆46Updated 9 months ago
- ☆28Updated 3 weeks ago
- Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German☆515Updated last year
- ☆33Updated last year
- The robust European language model benchmark.☆159Updated this week
- Pre-trained models and language resources for Natural Language Processing in Polish☆368Updated last year
- Norwegian Transformer Model☆116Updated 3 weeks ago
- ☆36Updated 2 years ago
- The broad index of NLP resources for Eastern European languages. The best EEML 2021 project.☆19Updated 3 years ago
- Machine learning course for first-year master's students in computational linguistics at the Higher School of Economics☆16Updated 5 years ago
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri…☆34Updated 3 years ago
- Curated list of Ukrainian natural language processing (NLP) resources (corpora, pretrained models, libraries, etc.)☆227Updated 3 months ago
- Natural language understanding benchmarks for Norwegian☆14Updated 5 months ago
- RoBERTa models for Polish☆89Updated 3 years ago
- Open German WordNet☆100Updated 3 weeks ago
- A Python library for efficient and flexible cycle-consistency training of transformer models via iterative back-translation. Memory and co…☆11Updated last year
- Russian Corpus of Linguistic Acceptability☆47Updated last year
- Apertium linguistic data for Kyrgyz☆17Updated 7 months ago
- A Python Wiktionary Parser☆371Updated 6 months ago
- Lingtrain Aligner: ML-powered library for accurate text alignment.☆147Updated 7 months ago
- A list of initiatives for adding new languages to open-source machine translation models☆21Updated 2 months ago
- A catalog of links to useful Bashkir language resources is collected here☆14Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency☆184Updated 8 months ago
- Linguistic Reconstruction with LingPy☆15Updated last year
- All languages stopwords collection☆476Updated 2 years ago
- Code to create the dataset from "A New Aligned Simple German Corpus"☆11Updated 2 years ago
- Extension for pie to include taggers with their models and pre/postprocessors☆11Updated last year
- Romanian Named Entity Corpus (RONEC) version 2.0☆66Updated 3 years ago
- A curated list of resources such as tools and datasets useful for the processing of Slovak language☆22Updated 2 weeks ago