keithrbennett / rika
A JRuby command line application and library for Apache Tika to extract text and metadata from files of various formats.
☆53 Updated last month
Alternatives and similar repositories for rika:
Users who are interested in rika are comparing it to the libraries listed below.
- annoy-rb provides Ruby bindings for Annoy (Approximate Nearest Neighbors Oh Yeah). ☆35 Updated 2 months ago
- Additional sidekiq middleware ☆92 Updated 8 years ago
- High speed text tokenization for Ruby ☆68 Updated 3 months ago
- Breakout detection for Ruby ☆46 Updated 3 months ago
- Filename sanitization for Ruby ☆224 Updated last year
- Ruby HTML sanitizer based on a lightweight Oga parser. ☆39 Updated 4 months ago
- Simple configuration library that works well with ENV vars and config files ☆23 Updated 2 years ago
- Fast, pure-Ruby Aho-Corasick string search ☆32 Updated 4 months ago
- Nice output to console/file from concurrent threads ☆45 Updated 3 years ago
- Ruby Agent for Instrumental Application Monitoring ☆59 Updated 4 years ago
- biggs is a small ruby gem/rails plugin for formatting postal addresses from over 60 countries. ☆149 Updated last year
- Quick #rebuild! method implementation for closure_tree on PostgreSQL ☆41 Updated 9 years ago
- Various distance and similarity measures for machine learning. ☆30 Updated 4 years ago
- Bundler plugin for showing gem diffs ☆44 Updated 2 months ago
- Ruby Scoring API for PMML ☆69 Updated 2 years ago
- Parse Accept and Accept-Language HTTP headers in Ruby. ☆84 Updated 3 weeks ago
- Flexible configuration for Ruby applications ☆67 Updated 3 years ago
- Edge stream anomaly detection for Ruby ☆54 Updated 3 months ago
- resque plugin to add unique jobs ☆35 Updated last year
- Stemming for Ruby, powered by Snowball ☆27 Updated 3 months ago
- A gem that allows you to write specs for your Rails 3 generators ☆89 Updated last year
- Super-fast router class for Rack application, derived from Keight.rb. ☆31 Updated last year
- High-performance time series algorithms for Ruby ☆37 Updated 5 months ago
- Recurring / Periodic / Scheduled / Cron job extension for Sidekiq ☆88 Updated last year
- Declarative input schemas for Ruby apps. ☆39 Updated 9 months ago
- Store text data in different languages. Similar to globalize, but uses PostgreSQL's JSONB to store data in a single field. No additional … ☆57 Updated this week
- is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot. ☆31 Updated 6 years ago
- Eliminates the drudgery of handcrafting an `autoload` statement for each Ruby source code file in your project ☆50 Updated 11 months ago
- Rails console history for Heroku, Docker, and more ☆80 Updated last month
- Fast computation of descriptive statistics in ruby using native code and SIMD ☆60 Updated last year