huridocs / pdf-reading-order
☆14Updated last year
Alternatives and similar repositories for pdf-reading-order
Users who are interested in pdf-reading-order are comparing it to the libraries listed below.
- Seed Machine Translation Data☆33Updated last year
- Small python package to measure OCR quality and other related metrics.☆25Updated last year
- ReadingBank: A Benchmark Dataset for Reading Order Detection☆113Updated last year
- Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.☆44Updated 2 years ago
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval☆29Updated 3 years ago
- Post-processing OCR errors with seq2seq models☆28Updated 5 years ago
- DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod…☆20Updated 2 years ago
- ☆13Updated 3 years ago
- Deploy DL/ ML inference pipelines with minimal extra code.☆101Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated 2 months ago
- Universal text classifier for generative models☆25Updated last year
- This repository is part of an NLP course for humanities and cultural studies. This course uses historical newspapers as a source and appl…☆18Updated 5 months ago
- Implementation of the DocLLM paper for Llama models.☆13Updated 7 months ago
- ☆14Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Updated 2 years ago
- ☆44Updated 4 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated 2 years ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r…☆19Updated last year
- TorchServe+Streamlit for easily serving your HuggingFace NER models☆33Updated 3 years ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch☆77Updated last month
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction☆24Updated 3 years ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions☆43Updated last year
- Template Extraction from unstructured Wikipedia text using NLP techniques.☆41Updated 5 years ago
- Using short models to classify long texts☆21Updated 2 years ago
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI☆56Updated 2 years ago
- Efficient few-shot learning with cross-encoders.☆59Updated last year
- FRAKE: Fusional Real-time Automatic Keyword Extraction☆21Updated 2 years ago
- ☆17Updated 4 years ago
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2…☆14Updated 2 years ago
- Simply, faster, sentence-transformers☆143Updated last year