thoppe / The-Pile-FreeLaw
Download, parse, and filter data from Court Listener, part of the FreeLaw projects. Data-ready for The-Pile.
☆13 Updated 2 years ago
Alternatives and similar repositories for The-Pile-FreeLaw
Users who are interested in The-Pile-FreeLaw are comparing it to the libraries listed below.
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆83 Updated last year
- LLM finetuning ☆43 Updated 2 years ago
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated last year
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆40 Updated last year
- LLM plugin for clustering embeddings ☆81 Updated last year
- ☆56 Updated 2 months ago
- Tools for formatting large language model prompts. ☆13 Updated last year
- Downloads 2020 English Wikipedia articles as plaintext ☆25 Updated 2 years ago
- This repository contains all the code for collecting large scale amounts of code from GitHub. ☆110 Updated 2 years ago
- A forest of autonomous agents. ☆19 Updated 7 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆71 Updated last year
- A set of utilities for running few-shot prompting experiments on large-language models ☆122 Updated last year
- ☆31 Updated last year
- Public reports detailing responses to sets of prompts by Large Language Models. ☆31 Updated 7 months ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆23 Updated 2 years ago
- Scrape and export data from the Open LLM Leaderboard. ☆45 Updated 8 months ago
- ☆16 Updated 4 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆64 Updated 2 years ago
- ☆29 Updated last year
- Python examples using the bigcode/tiny_starcoder_py 159M model to generate code ☆45 Updated 2 years ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆37 Updated last year
- Small, simple agent task environments for training and evaluation ☆18 Updated 10 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆67 Updated 9 months ago
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated 2 years ago
- Run SWE-bench evaluations remotely ☆40 Updated 2 weeks ago
- An Implementation of "Orca: Progressive Learning from Complex Explanation Traces of GPT-4" ☆43 Updated 10 months ago
- ☆20 Updated last year
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu… ☆54 Updated 2 years ago
- ☆154 Updated 4 years ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆52 Updated last week