Page Segmentation Code. I'm working with OCRopus and the UW-III data set to test how the page segmentation algorithms work with smaller strips of an image rather than the entire image.
☆20Feb 23, 2013Updated 13 years ago
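As a rough illustration of what "smaller strips of an image" could mean here, the sketch below computes horizontal strip boundaries for a page of a given height. It does not use OCRopus or the UW-III data, and it is not taken from this repository; the function names `strip_bounds` and `split_rows`, the `strip_height` and `overlap` parameters, and the choice of horizontal strips are all assumptions made for the example.

```python
def strip_bounds(height, strip_height, overlap=0):
    """Return (top, bottom) row ranges that cover an image of `height` rows.

    Hypothetical helper: each strip is at most `strip_height` rows tall and
    consecutive strips share `overlap` rows. The last strip may be shorter.
    """
    if strip_height <= 0:
        raise ValueError("strip_height must be positive")
    if not 0 <= overlap < strip_height:
        raise ValueError("overlap must satisfy 0 <= overlap < strip_height")

    step = strip_height - overlap
    bounds = []
    top = 0
    while top < height:
        bottom = min(top + strip_height, height)
        bounds.append((top, bottom))
        if bottom == height:
            break  # the page is fully covered
        top += step
    return bounds


def split_rows(image_rows, strip_height, overlap=0):
    """Slice a sequence of pixel rows into strips using strip_bounds."""
    return [
        image_rows[top:bottom]
        for top, bottom in strip_bounds(len(image_rows), strip_height, overlap)
    ]
```

Each strip could then be passed to a segmentation routine on its own and the results compared with those from the full page. Because `split_rows` only relies on slicing, it should also work on a NumPy array of pixel rows.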
Alternatives and similar repositories for page_segmentation
Users that are interested in page_segmentation are comparing it to the libraries listed below
- This project deals with hierarchical classification of web pages based on the DMOZ dataset.☆14Apr 10, 2014Updated 11 years ago
- A semantic web crawler☆20Sep 20, 2010Updated 15 years ago
- Web page segmentation and noise removal☆55Feb 4, 2024Updated 2 years ago
- A recommender system for GitHub repositories☆14Jun 21, 2014Updated 11 years ago
- Collects multimedia content shared through social networks.☆19Feb 18, 2015Updated 11 years ago
- Weighted multiple-instance learning algorithm☆18Oct 9, 2018Updated 7 years ago
- Common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆35Sep 30, 2016Updated 9 years ago
- Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts☆22May 17, 2019Updated 6 years ago
- Reproduction of the paper "Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks"☆26Nov 26, 2018Updated 7 years ago
- ☆70Apr 3, 2018Updated 7 years ago
- Memory-efficient DenseNet+LSTM+CTC implementation for Chinese text recognition☆31Jun 21, 2022Updated 3 years ago
- Document Classification and Post-OCR Key Value Extraction☆62Nov 6, 2019Updated 6 years ago
- The goal of this experiment is to take articles and certain metadata and group them by topic.☆11Apr 14, 2016Updated 9 years ago
- Sort-friendly URI Reordering Transform (SURT) python module☆44Sep 11, 2025Updated 5 months ago
- This repository is the official implementation of `A Semantic-based Arbitrarily-Oriented Scene Text Detector`(named STD++ as it is the im…☆29Aug 14, 2019Updated 6 years ago
- BlockCAT token sale smart contracts.☆11Oct 19, 2017Updated 8 years ago
- Text-to-speech for Ukrainian (Ukrainian TTS)☆10Oct 25, 2019Updated 6 years ago
- Focused Crawler for VT's CTRNet☆10May 13, 2013Updated 12 years ago
- An attempt at creating a gold standard dataset for backtesting yesterday & today's content-extractors☆35Mar 19, 2015Updated 10 years ago
- Extract (DOM tree) repetitions from a webpage☆12Jan 13, 2014Updated 12 years ago
- Green SqlAlchemy extensions for pulsar☆11Nov 24, 2017Updated 8 years ago
- Arduino library to generate a PWM signal over a shift register (74HC595)☆12Sep 25, 2020Updated 5 years ago
- ☆12Oct 25, 2015Updated 10 years ago
- Amazon Marketing Cloud Insights on AWS helps advertisers and agencies running campaigns on Amazon Ads to easily deploy AWS services to st…☆16Nov 3, 2025Updated 3 months ago
- Bicycle Incident reporting☆13Jul 22, 2022Updated 3 years ago
- Digitization information system built on top of Fedora repository☆16Jan 15, 2019Updated 7 years ago
- Generic framework for historical document processing☆382Jul 9, 2021Updated 4 years ago
- DFT-based text image rotation correction using OpenCV☆39Nov 25, 2013Updated 12 years ago
- "Old SFM": manage rules and streams from social data sources, starting with Twitter.☆86Aug 10, 2023Updated 2 years ago
- Elevator is an open source, on-disk key-value store. Provides high-performance bulk read-write operations over very large datasets while …☆70May 14, 2014Updated 11 years ago
- A collection of various discourse segmenters☆10Jun 30, 2017Updated 8 years ago
- Gamera 4 for Python 3☆14May 16, 2025Updated 9 months ago
- Rapidly develop your API client☆144Nov 10, 2015Updated 10 years ago
- Uses NLP methods to parse and classify contracts from The City of New Orleans☆10Mar 23, 2015Updated 10 years ago
- Research code for image interestingness☆17Dec 6, 2017Updated 8 years ago
- An Online Logic Assistant Based on Coq☆25Feb 15, 2012Updated 14 years ago
- agent has moved to https://lab.allmende.io/valueflows/agent☆10Jun 23, 2020Updated 5 years ago
- Spring integration with Stardog RDF database☆18Jan 27, 2025Updated last year
- Go library for accessing the Paddle API☆10Apr 14, 2022Updated 3 years ago