thoppe / The-Pile-FreeLaw
Download, parse, and filter data from CourtListener, part of the Free Law Project. Data-ready for The-Pile.
☆13Updated 2 years ago
Alternatives and similar repositories for The-Pile-FreeLaw
Users who are interested in The-Pile-FreeLaw are comparing it to the libraries listed below.
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models☆84Updated last year
- Small and Efficient Mathematical Reasoning LLMs☆72Updated last year
- Pre-training code for CrystalCoder 7B LLM☆55Updated last year
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training☆44Updated 5 years ago
- This repository contains all the code for collecting large scale amounts of code from GitHub.☆110Updated 2 years ago
- ☆31Updated last year
- ☆16Updated 5 months ago
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images☆42Updated last year
- Script for downloading GitHub.☆97Updated last year
- Downloads 2020 English Wikipedia articles as plaintext☆25Updated 2 years ago
- LLM finetuning☆42Updated 2 years ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated)☆24Updated 2 years ago
- ☆56Updated 3 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆33Updated 5 months ago
- ☆91Updated 3 years ago
- ☆157Updated 4 years ago
- Official code for ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L…☆45Updated 2 years ago
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu…☆54Updated 2 years ago
- Mixing Language Models with Self-Verification and Meta-Verification☆110Updated 9 months ago
- Ongoing research training transformer models at scale☆31Updated last week
- AI Evaluation Platform☆46Updated 4 months ago
- A set of utilities for running few-shot prompting experiments on large-language models☆122Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions."☆64Updated 2 years ago
- Experimental sampler to make LLMs more creative☆31Updated 2 years ago
- Lightweight tools for quick and easy LLM demos☆28Updated last year
- ☆26Updated last year
- This project studies the performance and robustness of language models and task-adaptation methods.☆153Updated last year
- Plug-and-play Search Interfaces with Pyserini and Hugging Face☆32Updated 2 years ago
- ☆85Updated 2 years ago
- An Implementation of "Orca: Progressive Learning from Complex Explanation Traces of GPT-4"☆43Updated 11 months ago