citiususc / pyplexity
Cleaning tool for web scraped text
☆39 · Updated last year
Alternatives and similar repositories for pyplexity:
Users who are interested in pyplexity are comparing it to the libraries listed below.
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆46 · Updated 8 months ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 3 years ago
- Embedding models from Jina AI ☆58 · Updated last year
- LLM plugin for clustering embeddings ☆68 · Updated 11 months ago
- Factored Cognition Primer: How to write compositional language model programs ☆48 · Updated last year
- A News Article Collection Library ☆22 · Updated last year
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆33 · Updated 10 months ago
- A Python micro framework for creating LLM-driven agents ☆24 · Updated 10 months ago
- RAG for any docs hosted on readthedocs ☆38 · Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- ☆26 · Updated 5 months ago
- LLM plugin for embeddings using sentence-transformers ☆49 · Updated last week
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆37 · Updated 5 years ago
- Plugin for LLM adding support for Anthropic's Claude models ☆36 · Updated 3 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆74 · Updated 3 months ago
- ☆29 · Updated last year
- Granular Viewer of Sentiments Between Entities in Massively Large Documents and Collections of Texts, powered by AREkit ☆37 · Updated last month
- Efficient few-shot learning with cross-encoders. ☆48 · Updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆44 · Updated last year
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 · Updated 9 months ago
- This is a proof-of-concept of using an LLM to find and extract meaningful data without parsing the html too much. ☆29 · Updated last year
- Structured Output Is All You Need! ☆53 · Updated 11 months ago
- A visual tool to interpret and understand PyTorch machine learning models ☆16 · Updated last year
- An Infr app that automates data collection from your PC, macOS or Linux client. ☆11 · Updated last year
- [Added T5 support to TRLX] A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) ☆47 · Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆33 · Updated 3 years ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆89 · Updated last year
- Annotation meets Large Language Models (ChatGPT, GPT-3 and alike). ☆56 · Updated last year
- spaCy entry points for Curated Transformers ☆26 · Updated 4 months ago