qburst / common-crawl-malayalam
Useful tools to extract Malayalam text from the Common Crawl datasets
☆28 · Updated 8 months ago
Alternatives and similar repositories for common-crawl-malayalam
Users who are interested in common-crawl-malayalam are comparing it to the libraries listed below.
- Semantically distinct key phrase extraction using Hilbert hashes. ☆50 · Updated 3 years ago
- Topic inference with zero-shot models ☆61 · Updated 2 years ago
- A utility for labeling clusters of text data. ☆28 · Updated 3 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 · Updated 2 years ago
- Automatically check for mismatches between code and comments using AI and ML ☆53 · Updated 4 years ago
- NeatText: a simple NLP package for cleaning textual data and text preprocessing ☆72 · Updated last year
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 · Updated 2 years ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆92 · Updated 3 years ago
- Conversational text analysis using various NLP techniques ☆180 · Updated 2 years ago
- A comprehensive tool for linguistic analysis of communities ☆49 · Updated 3 years ago
- Natural Language Generation for Gramex applications. ☆25 · Updated 3 years ago
- ☆43 · Updated 2 years ago
- Pyinfer is a model-agnostic tool for ML developers and researchers to benchmark the inference statistics for machine learning models or f… ☆24 · Updated 4 years ago
- Package that returns a company embedding given a company name ☆46 · Updated 5 years ago
- NLP tool to extract emotional phrases from tweets 🤩 ☆40 · Updated 3 years ago
- Information extraction from English and German texts based on predicate logic ☆138 · Updated 2 years ago
- Expose a Top2Vec model with a REST API. ☆91 · Updated 2 years ago
- Framework for building and maintaining self-updating prompts for LLMs ☆64 · Updated last year
- STriP Net: Semantic Similarity of Scientific Papers (S3P) Network ☆86 · Updated 3 years ago
- ☆47 · Updated 2 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆61 · Updated last week
- Instant search for and access to many datasets in PySpark. ☆34 · Updated 2 years ago
- spaCy match and replace, maintaining conjugation ☆35 · Updated 2 years ago
- Alternate Implementation for Zero Shot Text Classification: Instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models… ☆36 · Updated 3 years ago
- ☆13 · Updated 4 years ago
- Loan Risk Prediction Neural Network and API ☆17 · Updated 4 years ago
- Language detection using spaCy and fastText ☆57 · Updated last year
- Interactive tree-maps with SBERT & Hierarchical Clustering (HAC) ☆30 · Updated 7 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated last year
- The ntentional blog - a machine learning journey ☆23 · Updated 2 years ago