Shivanshu-Gupta / web-scrapers
A repository of my web-scraping projects
☆33 Updated 11 months ago
Alternatives and similar repositories for web-scrapers
Users who are interested in web-scrapers are comparing it to the repositories listed below
- BERT Probe: A python package for probing attention based robustness to character and word based adversarial evaluation. Also, with recipe… ☆18 Updated 3 years ago
- Sentence tokenizer for clinical/medical text. ☆28 Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- Using PubMed to find out how a gene contributes to addiction. ☆20 Updated 2 years ago
- MozoLM: A language model (LM) serving library ☆45 Updated 2 weeks ago
- Transforming textual descriptions into process models using deep learning ☆15 Updated 6 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 Updated 2 years ago
- Utility for cui2vec in Go ☆13 Updated 2 years ago
- Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health. ☆92 Updated 3 years ago
- A large scale Humor Dataset, containing more than 550k rated English jokes (LREC'20) ☆67 Updated 2 years ago
- A curated list of ML awesome frameworks & libraries for text data ☆16 Updated 2 years ago
- Visualization Tool for Mapping Out Researchers using Natural Language Processing ☆57 Updated last year
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆39 Updated 6 years ago
- Clean personally identifiable information from dirty dirty text using spaCy. ☆41 Updated 2 years ago
- Finds linguistic patterns effortlessly ☆38 Updated 2 years ago
- Natural Language Generation for Gramex applications. ☆25 Updated 3 years ago
- Intelligence Task Ontology (ITO) ☆74 Updated 3 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML ☆63 Updated 8 months ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 Updated 4 years ago
- A collection of textual datasets in Hausa language and the corresponding translation in English language. ☆16 Updated 4 years ago
- Custom Natural Language Processing with big and small models 🌲🌱 ☆67 Updated 4 years ago
- A PyPI package for easy text annotation in a Jupyter Notebook. ☆28 Updated 4 years ago
- Scripts to parse arxiv documents for NLP tasks ☆18 Updated 2 years ago
- A collection of utilities for writing labeling functions, transformation functions, and slicing functions. ☆22 Updated 5 years ago
- A utility for labeling clusters of text data. ☆28 Updated 4 years ago
- Finds out symptoms similar to a given symptom, from a symptom-disease data set. ☆51 Updated 7 years ago
- A corpus of textual data corresponding to synthetic clinical encounters, including each encounters’ dialogue transcript and clinical note… ☆39 Updated 2 years ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆36 Updated 2 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 Updated last week
- Replication materials for "Identifying the Development and Application of Artificial Intelligence in Scientific Text" ☆12 Updated 5 years ago