Download, parse, and filter data from Phil Papers. Data-ready for The-Pile.
☆20Aug 28, 2023Updated 2 years ago
Alternatives and similar repositories for The-Pile-PhilPapers
Users that are interested in The-Pile-PhilPapers are comparing it to the libraries listed below.
- Materials for 2021 Workshop on Text and Network Methods☆12Jun 16, 2022Updated 4 years ago
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations☆19Oct 18, 2025Updated 8 months ago
- Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon☆16May 8, 2025Updated last year
- speech to text gui for different (e.g. Whisper, Voxtral) models and backends, including whisper.cpp, crispasar, mlx-whisper, faster-whisp…☆22May 30, 2026Updated 2 weeks ago
- Materials from the NLPCSS 201 Social Media Preprocessing Tutorial, March 16, 2022☆13Nov 10, 2022Updated 3 years ago
- R package for Email Data Processing☆15Mar 1, 2018Updated 8 years ago
- Source code used in the blog☆12Feb 6, 2024Updated 2 years ago
- Extendable Scratch3 Programming Environment☆10Jun 12, 2026Updated last week
- Workshop "Analyzing Social Media Data" at the Big Data and Development Conference☆11Sep 11, 2023Updated 2 years ago
- ☆13Sep 9, 2020Updated 5 years ago
- ☆17Oct 8, 2022Updated 3 years ago
- Contains the abstracts, presentations, and video recordings of the work presented at the Why R? 2021 Turkey conference.☆11Apr 26, 2021Updated 5 years ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆32Oct 9, 2025Updated 8 months ago
- alternative remote for Lego Boost with Pythonista and iOS☆10Aug 27, 2017Updated 8 years ago
- ☆28Jul 18, 2025Updated 11 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training☆46Sep 22, 2020Updated 5 years ago
- Repo of materials for Oxford Spring School in Advanced Research Methods: Analysing Twitter Data☆14Apr 22, 2022Updated 4 years ago
- Web archiving utility library☆11May 5, 2026Updated last month
- ☆21Feb 9, 2022Updated 4 years ago
- A web application for studying Ancient Greek texts with integrated lexical, syntactic, and morphological analysis tools.☆21Dec 1, 2025Updated 6 months ago
- The Risale-i Nur Külliyatı in digital form, using the Diyanet text corrected against the original copy!☆24Dec 16, 2024Updated last year
- Text as Data Course Taught at Yale University, November 15 2019☆15Nov 15, 2019Updated 6 years ago
- A framework for few-shot evaluation of autoregressive language models.☆13Feb 14, 2024Updated 2 years ago
- ☆14Oct 6, 2025Updated 8 months ago
- Dataset Catalogue Homepage for Indonesian Languages☆12Feb 19, 2024Updated 2 years ago
- ☆17Apr 11, 2024Updated 2 years ago
- Using LEGO EV3 MicroPython with MQTT☆12Apr 29, 2019Updated 7 years ago
- R package for estimating political networks☆19Dec 10, 2020Updated 5 years ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients☆27Sep 10, 2024Updated last year
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"☆19Oct 4, 2022Updated 3 years ago
- End-to-End-Graphrag-implementation☆24Jul 16, 2024Updated last year
- Semeval-2021 Multilingual and Cross-lingual Word-in-Context Task☆18May 27, 2021Updated 5 years ago
- ☆25Dec 19, 2022Updated 3 years ago
- Estimation of Difference-in-Differences Treatment Effects with Staggered Treatment Onset Using Heterogeneity-Robust Two-Way Fixed Effects…☆21Feb 5, 2025Updated last year
- Flask Interface to Thompson's Motif Index☆20Jul 9, 2019Updated 6 years ago
- Röttger et al. (2024): "IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance"☆16Mar 6, 2026Updated 3 months ago
- [ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment☆11Apr 6, 2025Updated last year
- ☆44Aug 11, 2025Updated 10 months ago