floriancochard / extract-data-from-paper
A tool designed to extract numerical data from scanned historical weather documents.
☆13 Updated 3 months ago
Alternatives and similar repositories for extract-data-from-paper:
Users who are interested in extract-data-from-paper are comparing it to the libraries listed below:
- Unstract's interface to LLMs, Embeddings and VectorDBs. ☆18 Updated 7 months ago
- Use Google's state-of-the-art T5 pre-trained model to create human-like summaries ☆25 Updated 3 years ago
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model ☆23 Updated 5 months ago
- DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod… ☆20 Updated 2 years ago
- Machine Learning-assisted correction of OCR errors in historical corpora ☆9 Updated 4 months ago
- A swarm of LLM agents that will help you test, document, and productionize your code! ☆14 Updated 3 weeks ago
- Nougat is Meta AI's revolutionary OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format. ☆22 Updated last year
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 Updated 10 months ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 Updated last year
- ChatBot App built using LangChain and Lightning AI ☆18 Updated 2 years ago
- Automated PDF and text processing with spaCy and NLTK; information extraction from text based on grammatical structure; deployed on extra… ☆16 Updated 2 years ago
- Solve Geometric & Graph Problems with Large Language Models ☆28 Updated 2 years ago
- A Python package to get useful information from documents using TopicRank Algorithm. ☆16 Updated last year
- ☆11 Updated last year
- WhisperAnywhere: Effortless speech-to-text everywhere on your Mac. Use a hotkey to dictate in any app, powered by Whisper AI and Groq API… ☆15 Updated 4 months ago
- ChatGPT on your own data ☆24 Updated last year
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 Updated 3 years ago
- ☆13 Updated last month
- Translate any text using GPT. ☆16 Updated last year
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more ☆16 Updated last year
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆23 Updated 3 weeks ago
- Google Colab Demo of CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents ☆46 Updated 3 years ago
- An intelligent OCR to detect tables and pure text inside PDFs and obtain a CSV file and a TXT file from them ☆14 Updated 6 years ago
- Prompt Engineering for Large Language Models - Notebooks, Demos, Exercises, and Projects ☆22 Updated last year
- Easy formatted text extraction from images using Google Vision API ☆41 Updated 3 years ago
- Chat Complex PDF with Tables Using IBM WatsonX, Langchain and LlamaParser. ☆11 Updated 10 months ago
- Example Code to Supplement the Label Studio Blog ☆22 Updated 3 weeks ago
- Uses an adjacency matrix and a random forest to get the Name, Address, Items, Prices, and Grand total from all kinds of invoices. ☆18 Updated 5 years ago