citiususc / pyplexity
Cleaning tool for web scraped text
☆39 Updated last year
Related projects
Alternatives and complementary repositories for pyplexity
- LLM plugin for embeddings using sentence-transformers ☆42 Updated 9 months ago
- Plugin for LLM adding support for Google's PaLM 2 model ☆14 Updated last year
- Factored Cognition Primer: How to write compositional language model programs ☆48 Updated last year
- Embedding models from Jina AI ☆56 Updated 9 months ago
- A Python micro framework for creating LLM-driven agents ☆22 Updated 7 months ago
- The AI assistant for Obsidian that helps you write better and think more clearly ☆66 Updated last year
- Granular Viewer of Sentiments Between Entities in Massively Large Documents and Collections of Texts, powered by AREkit ☆37 Updated 3 months ago
- assign color hues to a collection of text fragments based on embeddings ☆20 Updated 4 months ago
- Structured Output Is All You Need! ☆48 Updated 7 months ago
- LLM plugin adding support for the MPT-30B language model ☆32 Updated last year
- LLM plugin for clustering embeddings ☆61 Updated 8 months ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 Updated 2 years ago
- A CLI tool for managing OpenAI batch processing jobs with ease. ☆26 Updated 2 months ago
- Python code for building a GPT-3 based technical blog post optimizer. ☆84 Updated 2 years ago
- [Added T5 support to TRLX] A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆47 Updated last year
- Use OpenAI Embeddings to visualize Kindle Highlights from Readwise! ☆28 Updated 5 months ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆32 Updated last year
- Documentation effort for the BookCorpus dataset ☆31 Updated 3 years ago
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆44 Updated 5 months ago
- Datasette enrichment for analyzing row data using OpenAI's GPT models ☆19 Updated 5 months ago
- YouTube Transcript Cleaner is a simple web-based application that improves the readability of YouTube transcripts. ☆22 Updated last year
- utilities for loading and running text embeddings with onnx ☆39 Updated 3 months ago
- Run embedding models using ONNX ☆23 Updated 9 months ago
- A Collection of Awesome Personal Search Engines and Related Projects ☆17 Updated last year
- Query databases and tables with AI assistance ☆16 Updated 6 months ago
- RAG for any docs hosted on readthedocs ☆33 Updated 10 months ago
- Search COVID-19 Open Research Dataset (CORD-19) using Vespa - the open source big data serving engine. ☆37 Updated last week
- An Infr app that automates data collection from your PC, macOS or Linux client. ☆11 Updated last year