davendw49 / sciparser
PDF parsing toolkit for preparing academic text corpus
☆54 · Updated 7 months ago
Alternatives and similar repositories for sciparser
Users who are interested in sciparser are comparing it to the libraries listed below:
- All in one PDF Parser Toolkit ☆16 · Updated last year
- Code and datasets for paper "K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization" in WSDM-2024 ☆182 · Updated 8 months ago
- LLM for Scientific Research Survey ☆48 · Updated 3 weeks ago
- [ICLR24] The open-source repo of THU-KEG's KoLA benchmark. ☆50 · Updated last year
- GAKG is a multimodal Geoscience Academic Knowledge Graph (GAKG) framework by fusing papers' illustrations, text, and bibliometric data. ☆49 · Updated 7 months ago
- [EMNLP2024] Aligning Large Language Models on Information Extraction ☆41 · Updated 3 months ago
- SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning (NeurIPS D&B Track 2024) ☆78 · Updated 11 months ago
- The code and data for "StructGPT: A general framework for Large Language Model to Reason on Structured Data" ☆101 · Updated 11 months ago
- A large-scale language model for scientific domain, trained on redpajama arXiv split ☆129 · Updated 11 months ago
- TianGong-AI-Unstructure ☆58 · Updated 3 weeks ago
- Official repository for paper "TableBench: A Comprehensive and Complex Benchmark for Table Question Answering" ☆38 · Updated 4 months ago
- Repo for ACL2023 paper "Plug-and-Play Knowledge Injection for Pre-trained Language Models" ☆60 · Updated 10 months ago
- ☆139 · Updated 7 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆141 · Updated 5 months ago
- [ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Gener… ☆59 · Updated 7 months ago
- Code for "A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction" ☆14 · Updated 11 months ago
- Paper list of "The Life Cycle of Knowledge in Big Language Models: A Survey" ☆59 · Updated last year
- ☆54 · Updated 4 months ago
- This is a meta-model distilled from LLMs for information extraction. This is an intermediate checkpoint that can be well-transferred to a… ☆24 · Updated 3 months ago
- [ACL 2024] OceanGPT: A Large Language Model for Ocean Science Tasks ☆37 · Updated 6 months ago
- ☆36 · Updated 5 months ago
- ☆53 · Updated 3 months ago
- [Paper][ACL 2024 Findings] Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering ☆188 · Updated 8 months ago
- ☆137 · Updated last year
- Zero-shot KGQA method based on curiosity-driven graph exploration. Agarwal et al., "Bring Your Own KG: Self-Supervised Program Synthesis … ☆27 · Updated 7 months ago
- MemoChat: Tuning LLMs to Use Memos for Consistent Long-Range Open-Domain Conversation ☆21 · Updated 10 months ago
- 🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆136 · Updated 7 months ago
- This is the official code for paper: [PiVe: Prompting with Iterative Verification Improving Graph-based Generative Capability of LLMs] ☆33 · Updated 5 months ago
- ☆80 · Updated last year
- ☆22 · Updated last year