dsdanielpark / arxiv2text
Converting PDF files to text, mainly with a focus on arXiv papers.
☆23 Updated last year
Alternatives and similar repositories for arxiv2text
Users who are interested in arxiv2text are comparing it to the libraries listed below.
- LLM reads a paper and produces a working prototype ☆57 Updated 6 months ago
- Explore the use of DSPy for extracting features from PDFs ☆46 Updated last year
- ☆50 Updated last year
- Compare how Agent systems perform on several benchmarks. ☆102 Updated 2 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆99 Updated last year
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L… ☆45 Updated 2 years ago
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ☆68 Updated last year
- Small and Efficient Mathematical Reasoning LLMs ☆72 Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆27 Updated 10 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆78 Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆109 Updated 10 months ago
- ☆77 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆50 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated last year
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec… ☆35 Updated last year
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 10 months ago
- Weekly visualization report of Open LLM model performance based on 4 metrics. ☆86 Updated last year
- Jina VDR is a multilingual, multi-domain benchmark for visual document retrieval ☆30 Updated 2 months ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆115 Updated last year
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆69 Updated last year
- ☆78 Updated 9 months ago
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT ☆27 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆66 Updated last year
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24 ☆147 Updated last year
- The first dense retrieval model that can be prompted like an LM ☆89 Updated 5 months ago
- Code for "Democratizing Reasoning Ability: Tailored Learning from Large Language Model", EMNLP 2023 ☆36 Updated last year
- A set of utilities for running few-shot prompting experiments on large-language models ☆123 Updated last year
- HuggingChat like UI in Gradio ☆70 Updated 2 years ago
- ☆35 Updated 2 years ago
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated last year