Data and information related to the Books3 dataset included as part of The Pile, and used to train Meta's LLaMA among others
☆36May 10, 2025Updated 11 months ago
Alternatives and similar repositories for Books3Info
Users that are interested in Books3Info are comparing it to the libraries listed below.
- This repository includes the implementation and results of the paper "ChatGPT is fun, but it is not funny! Humor is still challenging Lar…☆13Jul 13, 2023Updated 2 years ago
- Extremely fast pure-javascript bloom filter for node and browsers☆10Nov 7, 2017Updated 8 years ago
- Specifications for APIs we plan to develop for the Bokmålsordboka | Nynorskordboka dictionary. These APIs are neither open nor implement…☆15May 22, 2017Updated 8 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages☆11Feb 6, 2024Updated 2 years ago
- The Python Tutorials repository is where I share insightful tutorials on data science and analytics using Python, along with helpful Pyth…☆10Mar 17, 2025Updated last year
- API to count unique words in German and English texts☆12Dec 8, 2022Updated 3 years ago
- Towards Few-Shot Fact-Checking via Perplexity☆13Jun 11, 2021Updated 4 years ago
- ☆16Jul 27, 2025Updated 9 months ago
- Notebook seen in Jeremy Howard's keynote at posit::conf(2023)☆19Sep 21, 2023Updated 2 years ago
- A collection of example scripts that demonstrate different ways of using Ragie within your project☆13Jan 14, 2025Updated last year
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4.☆34Mar 19, 2023Updated 3 years ago
- Repository for collecting analyses and results for tidytuesday from CorrelAid members☆10Apr 11, 2023Updated 3 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- Map and analyze common Milwaukee architectural styles☆11Mar 21, 2022Updated 4 years ago
- Link AniDB, MAL, ANN and Anilist IDs☆17Apr 5, 2024Updated 2 years ago
- Starter Code (R and Python) for all CSV data sets of opendata.swiss☆12Feb 22, 2026Updated 2 months ago
- #TidyTuesday is a weekly social data project in R which encourages participants to summarize and arrange data to make meaningful charts w…☆14Jun 10, 2025Updated 10 months ago
- Deprecated! Library, CLI, and Discord bot for the unofficial ChatGPT API with progressive responses and more.☆11Dec 15, 2023Updated 2 years ago
- ☆33Jun 21, 2022Updated 3 years ago
- prbot: Pull Request robot☆13Mar 16, 2016Updated 10 years ago
- This app collects data from OSM (OpenStreetMap). You can change the queries to suit your needs and use it for data extraction.☆13Apr 29, 2020Updated 6 years ago
- A collection of reproducible inference engine benchmarks☆38Apr 22, 2025Updated last year
- Material for the CorrelCon 2020 Advanced Session "Building a modularized Shiny app with the golem 📦 and html widgets"☆12Mar 10, 2021Updated 5 years ago
- Generates time-based transcripts from parliamentary protocols published via the Bundestag Open Data service (https://www.bundestag.de/ser…☆14Jun 7, 2018Updated 7 years ago
- Interface to gather newspaper articles from ZEIT ONLINE, based on a multilevel query. Including sorting algorithms and graphical output …☆10Aug 11, 2018Updated 7 years ago
- Firecracker VM orchestration for Claude Code sessions☆27Mar 30, 2026Updated last month
- Technical documentation guidelines for Grafana Labs documentation☆38Updated this week
- Sample Azure Functions app using F# and MS Cognitive Services☆12Sep 5, 2016Updated 9 years ago
- Image Galleries for Shiny and RMarkdown☆17Jul 27, 2021Updated 4 years ago
- Pytest plugin to write Playwright tests with ease. Provides fixtures to have a page instance for each individual test and helpful CLI opt…☆14Aug 3, 2020Updated 5 years ago
- Monorepo containing all bashbuddy.run code☆34Dec 24, 2025Updated 4 months ago
- Plugin that allows loading local LLMs into Auto-GPT☆12Apr 21, 2023Updated 3 years ago
- Machine-readable election manifestos for the 2019 European election☆13May 14, 2019Updated 6 years ago
- ☆15Jan 26, 2025Updated last year
- A Python tool to help interact with ChatGPT.☆10Dec 11, 2022Updated 3 years ago
- Repo for QGIS Sample data (aka Alaska dataset), used in QGIS Documentation☆17Sep 14, 2022Updated 3 years ago
- Python data science and machine learning tutorial☆14Nov 5, 2023Updated 2 years ago
- Becoming 1% better at data science everyday☆15Sep 14, 2020Updated 5 years ago
- A command-line interface tool that converts natural language instructions into shell commands using OpenAI's GPT-4.☆21Mar 11, 2025Updated last year