apache / tika
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
☆2,830 Updated this week
Alternatives and similar repositories for tika:
Users who are interested in tika compare it to the libraries listed below.
- Mirror of Apache PDFBox ☆2,772 Updated this week
- Apache Solr open-source search software ☆1,327 Updated this week
- Apache Lucene open-source search software ☆2,883 Updated this week
- Apache Lucene and Solr open-source search software ☆4,373 Updated 5 months ago
- Mirror of Apache POI ☆1,998 Updated this week
- Apache Nutch is an extensible and scalable web crawler ☆2,987 Updated 2 months ago
- Apache Freemarker ☆1,013 Updated last week
- Apache ActiveMQ Classic ☆2,345 Updated this week
- Apache HBase ☆5,303 Updated this week
- Elasticsearch File System Crawler (FS Crawler) ☆1,377 Updated this week
- Ehcache 3.x line ☆2,037 Updated 2 months ago
- Apache Tomcat ☆7,746 Updated this week
- Apache ZooKeeper ☆12,409 Updated this week
- iText for Java represents the next level of SDKs for developers that want to take advantage of the benefits PDF can bring. Equipped with … ☆2,068 Updated this week
- Zipkin is a distributed tracing system ☆17,139 Updated 3 weeks ago
- Drools is a rule engine, DMN engine and complex event processing (CEP) engine for Java. ☆5,967 Updated last week
- Apache OpenNLP ☆1,486 Updated last week
- The reliable, generic, fast and flexible logging framework for Java. ☆3,087 Updated last week
- Official Elasticsearch Java Client ☆448 Updated this week
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,560 Updated 11 months ago
- Mirror of Apache Mahout ☆2,161 Updated this week
- Eclipse Jetty® - Web Container & Clients - supports HTTP/2, HTTP/1.1, HTTP/1.0, websocket, servlets, and more ☆3,900 Updated this week
- JODConverter automates document conversions using LibreOffice or Apache OpenOffice. ☆1,452 Updated 6 months ago
- Apache Avro is a data serialization system. ☆3,024 Updated this week
- Simple Logging Facade for Java ☆2,392 Updated 3 weeks ago
- An API Gateway built on Spring Framework and Spring Boot providing routing and more. ☆4,610 Updated this week
- Feign makes writing java http clients easier ☆9,611 Updated this week
- Apache Pulsar - distributed pub-sub messaging system ☆14,495 Updated this week
- Provides Familiar Spring Abstractions for Apache Kafka ☆2,279 Updated this week
- Apache Geode ☆2,300 Updated last month