☆23Apr 8, 2026Updated this week
Alternatives and similar repositories for OpenWebTextCorpus
Users that are interested in OpenWebTextCorpus are comparing it to the libraries listed below.
- ☆11Feb 3, 2025Updated last year
- A simple python library to take first names and return their gender using the genderize.io API.☆30May 20, 2023Updated 2 years ago
- code to remove "noise" from hOCR output of Tesseract OCR.☆14Oct 24, 2016Updated 9 years ago
- ☆13May 2, 2018Updated 7 years ago
- Resources for grounding protein families and complexes from text and describing their hierarchical relationships.☆18Mar 26, 2026Updated 2 weeks ago
- Using Huggingface to generate relation expressions☆15Jan 15, 2021Updated 5 years ago
- official diybookscanner repository☆39May 11, 2014Updated 11 years ago
- Data and code for paper "ODSum: New Benchmarks for Open Domain Multi-Document Summarization"☆11Sep 20, 2024Updated last year
- A Redis module that helps you calculate the real rating from positive/negative rating feedback.☆19Jul 31, 2017Updated 8 years ago
- ☆10Dec 9, 2018Updated 7 years ago
- ☆32Mar 13, 2024Updated 2 years ago
- ☆16Feb 10, 2026Updated 2 months ago
- ☆13Oct 13, 2022Updated 3 years ago
- MFAQ: a Multilingual FAQ Dataset☆18Sep 17, 2023Updated 2 years ago
- Simple, fast, scalable, production-grade dashboard application. The right solution for teams☆14Jul 26, 2024Updated last year
- [EMNLP 2021] Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning☆17Jun 28, 2025Updated 9 months ago
- Wikipedia Live Monitor☆22Dec 21, 2024Updated last year
- ☆23Apr 24, 2013Updated 12 years ago
- A web application for playing 20 Questions to crowdsource common sense. 🤖☆16Sep 29, 2022Updated 3 years ago
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne…☆34Apr 18, 2017Updated 8 years ago
- Human-Powered Data Analysis with Mechanical Turk☆299Nov 28, 2012Updated 13 years ago
- (Machine) Learning to Do More with Less☆14Jun 11, 2018Updated 7 years ago
- Visualizing Intergenerational Wealth Mobility and Racial Inequality☆10Mar 21, 2019Updated 7 years ago
- CoNLL-U format library for Python☆15Apr 7, 2015Updated 11 years ago
- SciAI is an extension for the text editors like Google Docs to create structured semantic biomedical papers directly at the moment of wri…☆22Jul 11, 2023Updated 2 years ago
- Implementation of BERT that could load official pre-trained models for feature extraction and prediction on TPU☆17Feb 24, 2019Updated 7 years ago
- Vossian Antonomasia☆10Oct 17, 2025Updated 5 months ago
- ☆69Oct 5, 2022Updated 3 years ago
- Language Model and Text Classification for German Language using Deep Learning☆18Jun 15, 2018Updated 7 years ago
- import information (affiliation, education) from ORCID database to Wikidata regarding authors of scientific papers☆16May 25, 2023Updated 2 years ago
- Documentation tool for Markdown conversion by Jupyter client☆16Mar 22, 2025Updated last year
- OpenRefine Reconciliation Framework in Python and Flask☆22May 1, 2023Updated 2 years ago
- Open Source implementation of the GA4GH htsget protocol for objects stored in Google Cloud Storage☆25Nov 19, 2019Updated 6 years ago
- codes for "Scheduled Sampling Based on Decoding Steps for Neural Machine Translation" (long paper of EMNLP-2022)☆20Aug 31, 2021Updated 4 years ago
- Link Python and Excel. Automate Excel calculations, charts, and tables☆11Nov 9, 2020Updated 5 years ago
- SmallK: very fast data clustering tools☆14Apr 3, 2019Updated 7 years ago
- A project to capture biological pathway data from academic papers☆29Mar 24, 2025Updated last year
- ☆22Nov 4, 2017Updated 8 years ago
- Links to example code downloads for Learning Path: Get Started with Natural Language Processing Using Python, Spark, and Scala☆17Feb 23, 2017Updated 9 years ago