hitachi-nlp / appjsonify
A handy PDF-to-JSON conversion tool for academic papers implemented in Python.
☆70 Updated last year
Alternatives and similar repositories for appjsonify
Users who are interested in appjsonify are comparing it to the libraries listed below.
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆96 Updated 9 months ago
- Automate the SLR process and write papers quickly using multiple AI agents ☆47 Updated last year
- [TACL] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retri… ☆31 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 11 months ago
- Interact with the Deep Search platform for new knowledge explorations and discoveries ☆214 Updated 7 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 11 months ago
- Scientific Document Insight Q/A ☆30 Updated 2 weeks ago
- A Python library to chunk/group your texts based on semantic similarity. ☆96 Updated last year
- Notebooks for training universal 0-shot classifiers on many different tasks ☆137 Updated 8 months ago
- multimodal document analysis ☆166 Updated last year
- Efficient few-shot learning with cross-encoders. ☆58 Updated last year
- ☆80 Updated last year
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A. ☆56 Updated 7 months ago
- Generalist and Lightweight Model for Text Classification ☆157 Updated 3 months ago
- a curated list of the role of small models in the LLM era ☆104 Updated 11 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆129 Updated last year
- Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their… ☆154 Updated last year
- Resources related to EACL 2023 paper "SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domain… ☆52 Updated 2 years ago
- Repository for deepdoctection tutorial notebooks ☆46 Updated 3 months ago
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024] ☆85 Updated 8 months ago
- TF-ID: Table/Figure IDentifier for academic papers ☆240 Updated last year
- Aligned, Review-Informed Edits of Scientific Papers ☆53 Updated 2 years ago
- ☆50 Updated 11 months ago
- Universal text classifier for generative models ☆24 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated last year
- ☆67 Updated last year
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆64 Updated last year
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets. ☆73 Updated 2 months ago
- Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and te… ☆42 Updated last year
- Advanced Reasoning Benchmark Dataset for LLMs ☆47 Updated last year