hitachi-nlp / appjsonify

A handy PDF-to-JSON conversion tool for academic papers implemented in Python.

☆69

Alternatives and similar repositories for appjsonify

Users who are interested in appjsonify are comparing it to the libraries listed below.

TIGER-AI-Lab / StructLM
Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)
☆75 Updated 9 months ago
allenai / mmda
multimodal document analysis
☆165 Updated last year
Knowledgator / LiqFit
Efficient few-shot learning with cross-encoders.
☆56 Updated last year
allenai / aries
Aligned, Review-Informed Edits of Scientific Papers
☆53 Updated 2 years ago
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47 Updated 11 months ago
padas-lab-de / ir-rag-sigir24-persona-rag
☆47 Updated 10 months ago
Knowledgator / FlashDeBERTa
Truly flash implementation of DeBERTa disentangled attention mechanism.
☆62 Updated 2 months ago
salesforce / summary-of-a-haystack
Codebase accompanying the Summary of a Haystack paper.
☆79 Updated 10 months ago
lfoppiano / document-qa
Scientific Document Insight Q/A
☆29 Updated last month
urchade / GraphER
GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction
☆76 Updated last year
S1M0N38 / dspy-arxiv
Explore the use of DSPy for extracting features from PDFs 🔎
☆45 Updated last year
Knowledgator / utca
Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr…
☆32 Updated 3 months ago
ziegler-ingo / CRAFT
Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval an…
☆30 Updated 10 months ago
rwightman / genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te…
☆42 Updated last year
deshwalmahesh / PHUDGE
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…
☆49 Updated last year
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33 Updated 3 months ago
GPT-Laboratory / SLR-automation
Automates the SLR process and speeds up paper writing using multiple AI agents
☆46 Updated last year
MoritzLaurer / zeroshot-classifier
Notebooks for training universal 0-shot classifiers on many different tasks
☆133 Updated 7 months ago
robertvacareanu / llm4regression
Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their…
☆153 Updated 10 months ago
princeton-nlp / LitSearch
[EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search
☆93 Updated 8 months ago
dswang2011 / DocLLM
DocLLM: A layout-aware generative language model for multimodal document understanding
☆128 Updated last year
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆65 Updated last year
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆49 Updated 5 months ago
davanstrien / data-for-fine-tuning-llms
☆79 Updated last year
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆101 Updated last year
deepdoctection / notebooks
Repository for deepdoctection tutorial notebooks
☆46 Updated last month
Knowledgator / GLiClass
Generalist and Lightweight Model for Text Classification
☆148 Updated last month
datacommonsorg / llm-tools
☆62 Updated 6 months ago
allenai / SciRIFF
Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.
☆40 Updated 4 months ago
stanfordnlp / pdf-struct
Logical structure analysis for visually structured documents
☆91 Updated 2 years ago