qburst / common-crawl-malayalam
Useful tools to extract Malayalam text from the Common Crawl datasets
☆28, updated 10 months ago
Alternatives and similar repositories for common-crawl-malayalam
Users who are interested in common-crawl-malayalam are comparing it to the repositories listed below
- Semantically distinct key phrase extraction using Hilbert hashes. (☆50, updated 3 years ago)
- A comprehensive tool for linguistic analysis of communities (☆49, updated 4 years ago)
- Topic Inference with Zeroshot models (☆61, updated 2 years ago)
- Automatically check for mismatches between code and comments using AI and ML (☆53, updated 4 years ago)
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. (☆92, updated 3 years ago)
- NeatText: a simple NLP package for cleaning textual data and text preprocessing (☆72, updated last year)
- Information extraction from English and German texts based on predicate logic (☆139, updated 2 years ago)
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine (☆243, updated 2 years ago)
- Framework for building and maintaining self-updating prompts for LLMs (☆64, updated last year)
- Expose a Top2Vec model with a REST API. (☆92, updated 2 years ago)
- Python library for feature selection for text features. It has filter method, genetic algorithm and TextFeatureSelectionEnsemble for impr… (☆53, updated last year)
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… (☆15, updated 2 years ago)
- Conversational text analysis using various NLP techniques (☆182, updated 2 years ago)
- A comprehensive reference for all topics related to building and maintaining microservices (☆67, updated 2 years ago)
- NLP tool to extract emotional phrases from tweets 🤩 (☆40, updated 4 years ago)
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… (☆37, updated 3 years ago)
- (☆33, updated 6 years ago)
- Question Generation - Question Answering for Automatic Flashcards (☆66, updated 3 years ago)
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. (☆80, updated 2 years ago)
- Custom Natural Language Processing with big and small models 🌲🌱 (☆67, updated 4 years ago)
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. (☆119, updated last year)
- 🧬 A JupyterLab extension for annotating data with Prodigy (☆189, updated 2 years ago)
- Natural Language Generation for Gramex applications. (☆25, updated 3 years ago)
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs). (☆74, updated last year)
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP tasks with legal texts. (☆39, updated 6 years ago)
- Pyinfer is a model agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f… (☆24, updated 4 years ago)
- Package that returns a company embedding given a company name (☆47, updated 5 years ago)
- 💫 spaCy wrapper for ConceptNet 💫 (☆95, updated 2 years ago)
- 🚀 GUI for training spaCy models (☆55, updated 4 years ago)
- (☆43, updated 2 years ago)