qburst / common-crawl-malayalam
Useful tools to extract Malayalam text from the Common Crawl datasets
☆28Updated last year
Alternatives and similar repositories for common-crawl-malayalam
Users that are interested in common-crawl-malayalam are comparing it to the libraries listed below
- Semantically distinct key phrase extraction using Hilbert hashes.☆50Updated 3 years ago
- NeatText: a simple NLP package for cleaning textual data and text preprocessing☆74Updated 2 years ago
- Topic Inference with Zeroshot models☆61Updated 2 years ago
- A curated list of ML awesome frameworks & libraries for text data☆16Updated 2 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine☆244Updated 2 years ago
- Language detection using spaCy and fastText☆57Updated 2 years ago
- Automatically check mismatch between code and comments using AI and ML☆54Updated 4 years ago
- Information extraction from English and German texts based on predicate logic☆140Updated 2 years ago
- Framework for building and maintaining self-updating prompts for LLMs☆65Updated last year
- Finds linguistic patterns effortlessly☆39Updated 2 years ago
- A comprehensive tool for linguistic analysis of communities☆49Updated 4 years ago
- Custom Natural Language Processing with big and small models 🌲🌱☆66Updated 4 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts.☆40Updated 6 years ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.☆92Updated 4 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod…☆15Updated 2 years ago
- Clean personally identifiable information from dirty dirty text using spaCy.☆41Updated 2 years ago
- ☆43Updated 2 years ago
- Natural Language Generation for Gramex applications.☆25Updated 3 years ago
- No Teacher BART distillation experiment for NLI tasks☆28Updated 5 years ago
- Question Generation - Question Answering for Automatic Flashcards☆66Updated 3 years ago
- 💫 SpaCy wrapper for ConceptNet 💫☆95Updated last week
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP.☆87Updated last week
- spaCy match and replace, maintaining conjugation☆36Updated 3 years ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models…☆37Updated 3 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy☆63Updated last year
- Expose a Top2Vec model with a REST API.☆92Updated 3 years ago
- 🚀GUI for training spaCy models☆55Updated 4 years ago
- ☆44Updated 2 years ago
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.☆120Updated 2 months ago
- 🧬 A JupyterLab extension for annotating data with Prodigy☆189Updated 2 years ago