huridocs / pdf-reading-order
☆11Updated 6 months ago
Related projects
Alternatives and complementary repositories for pdf-reading-order
- Small python package to measure OCR quality and other related metrics.☆21Updated 9 months ago
- Segmenting a given document using recursive xy-cut algorithm.☆12Updated 6 years ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"☆23Updated last week
- GPT-jax based on the official huggingface library☆13Updated 3 years ago
- Rust bindings for CTranslate2☆13Updated last year
- Large-scale query-focused multi-document Summarization dataset☆10Updated 3 years ago
- Index of URLs to pdf files all over the internet and scripts☆21Updated last year
- NLG Best Practices for Data-Efficient Modeling: How to Train Production-Ready Models with Little Data☆11Updated 3 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆15Updated 3 years ago
- Using short models to classify long texts☆20Updated last year
- 🚀🤗 A collection of templates for Hugging Face Spaces☆35Updated last year
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆23Updated 7 months ago
- A dashboard for exploring timm learning rate schedulers☆18Updated last year
- The collection of building blocks for building fine-tunable metric learning models☆32Updated last month
- Lightweight Non-Parametric Embedding Fine-Tuning☆17Updated last month
- Open sourced backend for Martian's LLM Inference Provider Leaderboard☆17Updated 3 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning☆33Updated last year
- Experiments for XLM-V Transformers Integration☆13Updated last year
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"☆15Updated last week
- code for paper "Accessing higher dimensions for unsupervised word translation"