Index of URLs to PDF files all over the internet and scripts
☆25May 2, 2023Updated 3 years ago
Alternatives and similar repositories for CCpdf
Users that are interested in CCpdf are comparing it to the libraries listed below.
- ☆60Aug 18, 2021Updated 4 years ago
- JSON Schema format for storing datasets details, documents processed contents, and documents annotations in the document understanding do…☆14Nov 5, 2024Updated last year
- A fast and highly accurate differentiable Top-k operator from the "Successive Halving Top-k Operator" AAAI'21 paper.☆16Jun 1, 2021Updated 4 years ago
- Training data for the NLPContributionGraph Shared Task 11 at SemEval-2021☆14Jan 11, 2021Updated 5 years ago
- Web archiving utility library☆11Mar 11, 2026Updated last month
- ☆18Jul 7, 2025Updated 9 months ago
- Analysis of the Pegida Facebook corpus☆10Jan 31, 2015Updated 11 years ago
- The most comprehensive Chinese Telegraph Code table☆12Jul 5, 2015Updated 10 years ago
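- weixin125: design and implementation of a personal health data management system, WeChat mini program + SSM backend, graduation project source code case study☆11Feb 28, 2024Updated 2 years ago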
- We enable LLM with personalization capability☆11Nov 16, 2023Updated 2 years ago
- Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy.☆17Jul 18, 2024Updated last year
- Implementation of the GLOM model for text☆11Mar 4, 2021Updated 5 years ago
- Benchmark dataset for the evaluation of scientific article representations on the task of citation recommendation across various scientif…☆12Oct 21, 2022Updated 3 years ago
- multimodal document analysis☆165Feb 28, 2026Updated 2 months ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 5 years ago
- utilities for loading and running text embeddings with onnx☆45Aug 16, 2025Updated 8 months ago
- Structured Multi-task Learning for Molecular Property Prediction, AISTATS'22 (https://proceedings.mlr.press/v151/liu22e.html)☆14Jul 6, 2022Updated 3 years ago
- Curated list of awesome datasets for various table understanding tasks☆18Sep 5, 2025Updated 8 months ago
- ☆18Jun 7, 2021Updated 4 years ago
- Record animations on HTML5 canvas☆14Apr 16, 2024Updated 2 years ago
- ☆17Dec 11, 2023Updated 2 years ago
- Single-line inference of SOTA deep learning models☆29Jan 22, 2023Updated 3 years ago
- Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing"☆33Mar 4, 2022Updated 4 years ago
- A set of utilities to turn Dataclasses into useful configuration managers.☆11Mar 27, 2024Updated 2 years ago
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022☆13Oct 20, 2022Updated 3 years ago
- Collecting good beginner tasks and project ideas.☆16Apr 23, 2018Updated 8 years ago
- [Paper] Code for the EMNLP2023 (Findings) paper "Global Structure Knowledge-Guided Relation Extraction Method for Visually-Rich Document"☆17Dec 1, 2023Updated 2 years ago
- GC4LM: A Colossal (Biased) language model for German☆13May 2, 2021Updated 5 years ago
- SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)☆105Mar 31, 2025Updated last year
- DSIR large-scale data selection framework for language model training☆273Apr 7, 2024Updated 2 years ago
- Scripts for downloading and pre-processing the `proof-pile`, a high quality dataset of mathematical text and code.☆22Nov 26, 2022Updated 3 years ago
- Scaffolding for multi-user Elm applications via Gulp, Express, and SockJS.☆10Apr 10, 2015Updated 11 years ago
- ☆10Apr 4, 2023Updated 3 years ago
- A script for collecting the PubMed Central dataset in a language modelling friendly format.☆26Feb 16, 2021Updated 5 years ago
- The pipeline for the OSCAR corpus☆177Nov 9, 2025Updated 5 months ago
- ☆10Jan 20, 2024Updated 2 years ago
- Program Translator AI built on PyTorch☆15Dec 19, 2019Updated 6 years ago
- A Living Papers article starter template.☆25Nov 3, 2023Updated 2 years ago
- Data and code for the paper "CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding"☆14Sep 8, 2022Updated 3 years ago