thoppe / The-Pile-FreeLaw
Download, parse, and filter data from Court Listener, part of the FreeLaw projects. Data-ready for The-Pile.
☆14 Updated 2 years ago
Alternatives and similar repositories for The-Pile-FreeLaw
Users who are interested in The-Pile-FreeLaw are comparing it to the libraries listed below.
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆83 Updated last year
- Small and Efficient Mathematical Reasoning LLMs ☆72 Updated last year
- Statistics of Common Crawl monthly archives mined from URL index files ☆199 Updated this week
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆41 Updated 2 years ago
- This repository contains all the code for collecting large scale amounts of code from GitHub. ☆109 Updated 2 years ago
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu… ☆54 Updated 2 years ago
- ☆92 Updated 3 years ago
- Downloads 2020 English Wikipedia articles as plaintext ☆24 Updated 2 years ago
- Tools to construct and process Common Crawl webgraphs ☆101 Updated last week
- ☆31 Updated last year
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆45 Updated 5 years ago
- Script for downloading GitHub. ☆97 Updated last year
- LLM plugin for clustering embeddings ☆82 Updated last year
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated last year
- LLM finetuning ☆41 Updated 2 years ago
- ☆17 Updated 7 months ago
- Code for constructing TLDR corpus from Reddit dataset ☆26 Updated 3 years ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆109 Updated 11 months ago
- Based on the tree of thoughts paper ☆48 Updated 2 years ago
- ☆159 Updated 4 years ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆42 Updated last year
- An attribution library for LLMs ☆46 Updated last year
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆53 Updated 4 months ago
- Python examples using the bigcode/tiny_starcoder_py 159M model to generate code ☆45 Updated 2 years ago
- ☆26 Updated last year
- ☆58 Updated last year
- A set of utilities for running few-shot prompting experiments on large-language models ☆126 Updated 2 years ago
- ☆20 Updated 2 years ago
- ☆32 Updated 2 years ago
- Small, simple agent task environments for training and evaluation ☆19 Updated last year