Web Content Extraction Benchmark
☆23Dec 16, 2025Updated 4 months ago
Alternatives and similar repositories for web-content-extraction-benchmark
Users that are interested in web-content-extraction-benchmark are comparing it to the libraries listed below.
- The Open Multilingual Wordnet Project Page☆15May 29, 2023Updated 2 years ago
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…☆13Jun 24, 2024Updated last year
- ↕️ Intuitive axiomatic retrieval experimentation.☆31Mar 16, 2026Updated last month
- ☆21Jul 25, 2025Updated 8 months ago
- Estimation of party positions from Wikipedia tags (see Herrmann/Döring 2021)☆10Jul 31, 2025Updated 8 months ago
- 2018 Computational Text Analysis Notebooks, University of Mannheim☆13Nov 22, 2018Updated 7 years ago
- Web archiving utility library☆11Mar 11, 2026Updated last month
- ☆13Jan 20, 2023Updated 3 years ago
- Code repository for the paper "Mission: Impossible Language Models."☆56Sep 25, 2025Updated 6 months ago
- Calculating Expected Time for training LLM.☆39Apr 17, 2023Updated 3 years ago
- [ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".☆229Aug 28, 2024Updated last year
- Data Management with SQL for Social Scientists☆11Updated this week
- This repository contains the slides for my short tutorial on cross-lingual supervised text classification I have prepared for the COMPTEX…☆14May 5, 2022Updated 3 years ago
- Data and preprocessing scripts for SemEval 2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding☆15Feb 3, 2022Updated 4 years ago
- Detecting Concreteness in Natural Language☆16Jan 25, 2024Updated 2 years ago
- An official implementation of EHRDiff [TMLR]☆33Jun 25, 2024Updated last year
- Official Code Repository for the paper "KALA: Knowledge-Augmented Language Model Adaptation" (NAACL 2022)☆35Oct 17, 2023Updated 2 years ago
- ☆13Dec 16, 2024Updated last year
- Workshop Materials "Advanced Bayesian Statistical Modeling in R and Stan"☆12Nov 23, 2023Updated 2 years ago
- React Native app for DoYou.world☆15Jan 21, 2026Updated 2 months ago
- Social Science Workshop Overview☆17Updated this week
- [npj Digital Medicine'25] Continuous sleep depth index annotation with deep learning yields novel digital biomarkers for sleep health☆16Apr 13, 2025Updated last year
- The Florence Tool CLI provides a command-line interface for processing images using the Florence-2 model. This tool allows users to apply…☆16Jan 21, 2025Updated last year
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆15Updated this week
- ☆12Aug 3, 2022Updated 3 years ago
- Tutorial on Transformers 🤖, HuggingFace 🤗 and Social Science Applications 👥 @ IC2S2☆17Aug 8, 2021Updated 4 years ago
- C# code for "Towards Easier and Faster Sequence Labeling for Natural Language Processing: A Search-based Probabilistic Online Learning Fr…☆13Nov 19, 2018Updated 7 years ago
- Transition-based Dependency Parser with neural networks and hybrid oracle☆13May 14, 2018Updated 7 years ago
- Download, parse, and filter data from Phil Papers. Data-ready for The-Pile.☆19Aug 28, 2023Updated 2 years ago
- The Official Repo for Paper: Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning☆22Apr 7, 2026Updated last week
- R code and predictions for the case study from Van Calster et al (Validation Studies of Predictive AI for Use in Medical Practice: Overv…☆21Dec 15, 2025Updated 4 months ago
- the code for drawing.garden☆24Apr 11, 2021Updated 5 years ago
- ☆171May 2, 2024Updated last year
- [ICML 2023] Taxonomy-Structured Domain Adaptation☆12Oct 6, 2023Updated 2 years ago
- Code for "Cross-Domain and Semi-Supervised Named Entity Recognition in Chinese Social Media: A Unified Model"☆18Feb 14, 2022Updated 4 years ago
- Capture webpage and save as image using chromedp☆18Apr 12, 2026Updated last week
- ☆31Apr 9, 2026Updated last week
- TrialPanorama: Developing Large Language Models for Clinical Research Using One Million Clinical Trials☆25Dec 26, 2025Updated 3 months ago
- Blog website built with Next.js, React, TailwindCSS, and Markdown for the blogs. Giscus powers the comments, and the site is integrated w…☆16Sep 18, 2024Updated last year