psmedia / Books3Info
Data and information related to the Books3 dataset included as part of The Pile, and used to train Meta's LLaMA among others
☆35 Updated 9 months ago
Alternatives and similar repositories for Books3Info
Users who are interested in Books3Info are comparing it to the libraries listed below.
- LLM plugin for clustering embeddings ☆82 Updated last year
- ☆67 Updated last year
- LLM plugin providing access to Mistral models using the Mistral API ☆206 Updated 6 months ago
- Code for the paper: "Large Language Models as Corporate Lobbyists" (2023). ☆171 Updated 3 years ago
- LLM plugin for embeddings using sentence-transformers ☆74 Updated 9 months ago
- Tools to construct and process Common Crawl webgraphs ☆105 Updated last week
- Knowledge Graph Generator app ☆34 Updated last year
- The AI Knowledge Editor ☆184 Updated 3 years ago
- 🚀 Template Haystack Search Application with Streamlit ☆27 Updated last year
- Code and data to support "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4" ☆69 Updated 2 years ago
- Small python package to measure OCR quality and other related metrics. ☆26 Updated last year
- LLM plugin for models hosted on Replicate ☆65 Updated last year
- https://verdad.app ☆86 Updated last week
- Some tough questions to test new models. ☆28 Updated last year
- 💭 Build autonomous agents, retrieval augmented generation (RAG) processes and language model powered chat applications ☆333 Updated 8 months ago
- A Datasette plugin that turns a Datasette instance into a ChatGPT plugin ☆69 Updated last year
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆199 Updated 8 months ago
- LLM plugin for running models using llama.cpp ☆146 Updated 2 years ago
- Tutorial and template for a semantic search app powered by the Atlas Embedding Database, Langchain, OpenAI and FastAPI ☆114 Updated 2 years ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Bringing Generative AI to the way the Civil Service works ☆134 Updated last month
- A BERT-based application for reusable text classification at scale ☆38 Updated 2 years ago
- examples and guides to using Nomic Atlas ☆37 Updated 9 months ago
- ☆22 Updated 2 years ago
- ☆185 Updated 2 years ago
- Access the Cohere Command R family of models ☆38 Updated 10 months ago
- A Chrome extension that saves conversations with Claude to GitHub Gists or your clipboard. ☆90 Updated last year
- Libraries, Archives and Museums (LAM) ☆88 Updated 3 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆81 Updated 2 years ago
- A proof of concept tool for using ChatGPT to transform messy text documents into structured JSON ☆122 Updated last year