davendw49 / sciparser

PDF parsing toolkit for preparing academic text corpora

☆58

Alternatives and similar repositories for sciparser

Users who are interested in sciparser are comparing it to the libraries listed below.

davendw49 / k2
Code and datasets for paper "K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization" in WSDM-2024
☆195 Updated last year
Acemap / pdf_parser
All in one PDF Parser Toolkit
☆16 Updated last year
davendw49 / gakg
GAKG is a multimodal Geoscience Academic Knowledge Graph (GAKG) framework that fuses papers' illustrations, text, and bibliometric data.
☆53 Updated 11 months ago
gmftbyGMFTBY / science-llm
A large-scale language model for the scientific domain, trained on the RedPajama arXiv split
☆133 Updated last year
THU-KEG / KoLA
[ICLR24] The open-source repo of THU-KEG's KoLA benchmark.
☆50 Updated last year
THUDM / SciGLM
SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning (NeurIPS D&B Track 2024)
☆79 Updated last year
JBoRu / StructGPT
The code and data for "StructGPT: A general framework for Large Language Model to Reason on Structured Data"
☆103 Updated last year
thunlp / Knowledge-Plugin
Repo for ACL2023 paper "Plug-and-Play Knowledge Injection for Pre-trained Language Models"
☆61 Updated last year
c-box / KnowledgeLifecycle
Paper list of "The Life Cycle of Knowledge in Big Language Models: A Survey"
☆59 Updated last year
Zheng0428 / COIG-Kun
☆36 Updated 9 months ago
thu-coai / CritiqueLLM
☆142 Updated 11 months ago
du-nlp-lab / LLM4SR
LLM for Scientific Research Survey
☆96 Updated 5 months ago
OpenMatch / Augmentation-Adapted-Retriever
[ACL 2023] This is the code repo for our ACL'23 paper "Augmentation-Adapted Retriever Improves Generalization of Language Models as Gener…
☆60 Updated 11 months ago
lfy79001 / TableQAKit
A Toolkit for Table-based Question Answering
☆112 Updated last year
geobrain-ai / geogalactica
Code and datasets for paper "GeoGalactica: A Scientific Large Language Model in Geoscience"
☆32 Updated last year
thunlp / Adaptive-Note
☆57 Updated 8 months ago
cavalierlulu / rag_survey
☆124 Updated last year
amazon-science / robust-tableqa
Two approaches for robust TableQA: 1) ITR is a general-purpose retrieval-based approach for handling long tables in TableQA transformer m…
☆39 Updated last year
gydpku / PPTC
PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion
☆54 Updated last year
Junjie-Ye / ToolEyes
[COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
☆68 Updated last month
zexuanqiu / CLongEval
CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models
☆40 Updated last year
Spico197 / Humpback
🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.
☆140 Updated last month
Abbey4799 / CELLO
Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024)
☆48 Updated last year
EdisonNi-hku / chatreport
GitHub implementation of https://reports.chatclimate.ai/
☆20 Updated last week
kongds / scaling_sentemb
Scaling Sentence Embeddings with Large Language Models
☆110 Updated last year
zhxlia / Awesome-TableReasoning-LLM-Survey
☆41 Updated 10 months ago
hyintell / awesome-refreshing-llms
EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining.
☆133 Updated last year
tan92hl / Complex-Question-Answering-Evaluation-of-GPT-family
A large-scale complex question answering evaluation of ChatGPT and similar large-language models
☆40 Updated last year
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆130 Updated last year
Wusiwei0410 / SciMMIR
☆22 Updated 10 months ago