Shivanshu-Gupta / web-scrapers
A repository of my web-scraping projects
☆30Updated 6 months ago
Alternatives and similar repositories for web-scrapers
Users who are interested in web-scrapers are comparing it to the repositories listed below
- BERT Probe: A Python package for probing attention-based robustness to character- and word-based adversarial evaluation. Also, with recipe…☆18Updated 2 years ago
- NLG Best Practices for Data-Efficient Modeling How to Train Production-Ready Models with Little Data☆10Updated 3 years ago
- http://icd10data.com/ data scraping☆20Updated 6 years ago
- Generate True or False questions from any content with OpenAI GPT2 text generation, Sentence-BERT semantic search and Berkeley constituenc…☆33Updated 5 years ago
- A collection of textual datasets in Hausa language and the corresponding translation in English language.☆15Updated 4 years ago
- Sentence tokenizer for clinical/medical text.☆26Updated 11 months ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod…☆15Updated 2 years ago
- Medical Relations and Entities Extraction☆36Updated 2 years ago
- Clinical NER with UMLS lookup☆22Updated 5 years ago
- Training a model without a dataset for natural language inference (NLI)☆25Updated 4 years ago
- Universal Dependencies (v1.0) for the GENIA 1.0 Treebank, along with additional raw abstracts and metadata.☆22Updated 5 years ago
- Huggingface inference with GPU Docker on AWS☆41Updated 3 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts.☆38Updated 5 years ago
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.☆39Updated last year
- Post-processing OCR errors with seq2seq models☆28Updated 4 years ago
- How do we process data in different formats like docx, pdf, etc. and generate insights to be linked with structured data in a database? This p…☆14Updated 4 years ago
- Multitask Learning with Pretrained Transformers☆40Updated 4 years ago
- Building Chatbots with Rasa, Spacy, Wit.Ai, etc.☆30Updated 6 years ago
- Common crawl pretrained sentencepiece tokenizers for English and Japanese for various vocabulary sizes. Also development environment for …☆10Updated 3 years ago
- Tooling to play around with multilingual machine translation for Indian Languages.☆22Updated 3 years ago
- Table compiling the list of biomedically-related corpora available for named entity recognition (and some also suitable for association d…☆18Updated 7 years ago
- A proposal for creating a reflective listening chatbot☆1Updated 3 years ago
- Transforming textual descriptions into process models using deep learning☆14Updated 6 years ago
- StAtutory Reasoning Assessment☆13Updated 2 years ago
- Extracting narrative timelines (i.e. order and timing of events) from text☆20Updated 6 years ago
- BERT semantic search engine for searching COVID-19 research literature in Google Colab☆31Updated 5 years ago
- CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed dat…☆35Updated 4 years ago
- Useful tools to extract malayalam text from the Common Crawl Datasets☆28Updated 5 months ago
- Almost state-of-the-art text generation library☆66Updated last week
- A short tutorial to map biomedical free-text into UMLS concepts using MetaMap☆28Updated last year