thoppe / The-Pile-FreeLaw
Download, parse, and filter data from CourtListener, part of the Free Law Project. Data-ready for The Pile.
☆12 Updated 2 years ago
Alternatives and similar repositories for The-Pile-FreeLaw
Users who are interested in The-Pile-FreeLaw are comparing it to the libraries listed below.
- ☆30 Updated last year
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated last year
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆83 Updated last year
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- Downloads 2020 English Wikipedia articles as plaintext ☆25 Updated 2 years ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆31 Updated 7 months ago
- ☆90 Updated 3 years ago
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆40 Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆64 Updated 2 years ago
- ☆49 Updated 6 months ago
- Official repo for EMNLP 2023 paper "Explain-then-Translate: An Analysis on Improving Program Translation with Self-generated Explanations… ☆29 Updated last year
- ☆15 Updated 4 months ago
- ☆57 Updated 10 months ago
- ☆53 Updated 9 months ago
- Data preparation code for Amber 7B LLM ☆91 Updated last year
- ☆37 Updated last year
- An Implementation of "Orca: Progressive Learning from Complex Explanation Traces of GPT-4" ☆43 Updated 9 months ago
- ☆40 Updated 7 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- Voyage AI Official Python Library ☆66 Updated 2 weeks ago
- This repository contains all the code for collecting large scale amounts of code from GitHub. ☆110 Updated 2 years ago
- LLM finetuning ☆42 Updated 2 years ago
- Run SWE-bench evaluations remotely ☆37 Updated last week
- Multi-agent workflows and complex Agent interactions, both via YAML manifest and programmatic usage. Pydantic-AI and LiteLLM backends. Hu… ☆23 Updated last week
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆66 Updated 9 months ago
- ☆64 Updated last month
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆35 Updated 2 years ago
- ☆56 Updated last month
- An attribution library for LLMs ☆42 Updated 10 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 6 months ago