tokgolich / doctotext
Converts DOC, XLS, XLSB, PPT, RTF, ODF (ODT, ODS, ODP), OOXML (DOCX, XLSX, PPTX), iWork (PAGES, NUMBERS, KEYNOTE), ODFXML (FODP, FODS, FODT), PDF, EML and HTML documents to plain text. Extracts metadata and annotations.
☆130Updated 7 years ago
Alternatives and similar repositories for doctotext:
Users who are interested in doctotext are comparing it to the libraries listed below.
- DocWire SDK: Award-winning modern data processing in C++20. SourceForge Community Choice & Microsoft support. AI-driven processing. Suppo…☆83Updated last week
- Some much needed maintenance of http://silvercoders.com/en/products/doctotext/☆27Updated 2 years ago
- wv is a library which allows access to Microsoft Word files. It can load and parse Word 2000, 97, 95 and 6 file formats. (These are the f…☆17Updated 7 years ago
- The first open-source C++ development library for OFD.☆62Updated last year
- File parsing: doctotext source code, version 4.0-20140202☆14Updated 8 years ago
- This is not the poppler repository. Please see https://poppler.freedesktop.org/☆54Updated 15 years ago
- C/Python library to extract text from MS doc files☆11Updated 2 years ago
- A light-weight C++ HTML processing library based on pugixml☆19Updated 12 years ago
- ☆52Updated 4 years ago
- GNU character conversion library☆137Updated 6 months ago
- A very rudimentary ofdEditor☆39Updated 5 years ago
- ☆422Updated 10 years ago
- C++/C library to construct Excel .xls files in code. Official git repo.☆45Updated 4 years ago
- c++ library wrapper of 7zip☆134Updated 2 years ago
- Qt/C++ library based on google pdfium☆15Updated 4 years ago
- Cross platform C/C++ library with C#, Java, Python, Progress 4GL wrappers and command line tools for generating Microsoft Word .DOCX (Ope…☆168Updated 8 years ago
- libchardet - Mozilla's Universal Charset Detector C/C++ API☆112Updated 3 years ago
- PDFium Reader☆67Updated last year
- PDFium library without V8 JavaScript engine - compiles under Linux, Mac and Windows☆61Updated 9 years ago
- RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any othe…☆66Updated 11 months ago
- Modern C++20 library for creating Microsoft Word Document (.docx file).☆125Updated 2 months ago
- Printer Driver for Windows 2000/XP which does: metafiles from spool file for previewing and printing to a real printer and creates metafi…☆16Updated 8 years ago
- VersyPDF is a high-quality, industry-strength PDF library for C/C++ programming languages meeting the requirements of the most demanding …☆243Updated 3 years ago
- A miniblink browser built with duilib☆101Updated 3 years ago
- cximage 7.0.1 mirror☆81Updated 13 years ago
- A C++17 PDF manipulation library☆461Updated this week
- libiconv Windows build with Visual Studio.☆104Updated last year
- pdf2word: sdk by BCL☆9Updated 8 years ago
- PDF parser☆25Updated 5 years ago
- extract text from MS-WORD's .doc binary format file☆34Updated 2 years ago