dhruvilgala / tvtropes
☆60 Updated 2 years ago
Alternatives and similar repositories for tvtropes
Users who are interested in tvtropes are comparing it to the repositories listed below
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 Updated 3 years ago
- ☆67 Updated last year
- Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) wor… ☆212 Updated 2 years ago
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆110 Updated 6 years ago
- ☆168 Updated last year
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆196 Updated last year
- Libraries, Archives and Museums (LAM) ☆85 Updated 2 years ago
- assign color hues to a collection of text fragments based on embeddings ☆20 Updated last year
- ☆95 Updated last year
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆80 Updated 2 years ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆48 Updated 2 years ago
- Code and data to support "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4" ☆69 Updated 2 years ago
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆43 Updated 4 years ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆185 Updated 3 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆65 Updated last year
- Repo for the paper "Detecting Logical Fallacies: From Quiz to Climate Change News" (2021) ☆79 Updated last year
- A BERT-based application for reusable text classification at scale ☆38 Updated 2 years ago
- A corpus and code for understanding norms and subjectivity. 🤖 ☆50 Updated 11 months ago
- Efficient few-shot learning with cross-encoders. ☆57 Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆89 Updated last year
- Highly concurrent and fast content processing for Mighty Inference Server ☆10 Updated 2 years ago
- The AI Knowledge Editor ☆185 Updated 3 years ago
- Pre-train Static Word Embeddings ☆84 Updated 2 months ago
- 💫 SpaCy wrapper for ConceptNet 💫 ☆93 Updated 2 years ago
- llm sampler that only allows words that are in the bible ☆27 Updated 8 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Frame Semantic Parser based on T5 and FrameNet ☆62 Updated last year
- Semantic search engine indexing 110 million academic publications ☆91 Updated last month
- LLM plugin for clustering embeddings ☆81 Updated last year