dhruvilgala / tvtropes
☆54 · Updated last year
Alternatives and similar repositories for tvtropes:
Users who are interested in tvtropes are comparing it to the repositories listed below:
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 3 years ago
- Documentation effort for the BookCorpus dataset ☆33 · Updated 3 years ago
- Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) wor… ☆210 · Updated last year
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆39 · Updated 4 years ago
- SFGram (Science-Fiction Gram) is a dataset of public science-fiction novels, books and movie covers. It is designed to be used by researc… ☆30 · Updated 6 years ago
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4. ☆32 · Updated last year
- Factored Cognition Primer: How to write compositional language model programs ☆48 · Updated last year
- Discourse Analysis Tool Suite ☆18 · Updated this week
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆76 · Updated last year
- LLM plugin for clustering embeddings ☆65 · Updated 10 months ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆103 · Updated 6 years ago
- A BERT-based application for reusable text classification at scale ☆37 · Updated last year
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆48 · Updated 2 years ago
- Semantically Structured Sentence Embeddings ☆66 · Updated 3 months ago
- Vespa application making an index of the CORD-19 dataset. ☆39 · Updated 2 months ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 · Updated 10 months ago
- Mining Legal Arguments in Court Decisions - Data and software ☆65 · Updated last year
- 💫 SpaCy wrapper for ConceptNet 💫 ☆89 · Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆86 · Updated 10 months ago
- A corpus of poetry from Project Gutenberg ☆194 · Updated 6 years ago
- assign color hues to a collection of text fragments based on embeddings ☆20 · Updated 7 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆119 · Updated 8 months ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆44 · Updated last year
- Libraries, Archives and Museums (LAM) ☆82 · Updated 2 years ago
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆72 · Updated 5 months ago
- Repo for the LREC 2022 paper The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts. ☆13 · Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆28 · Updated last week