dhruvilgala / tvtropes
☆60 Updated 2 years ago
Alternatives and similar repositories for tvtropes
Users who are interested in tvtropes are comparing it to the repositories listed below
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 Updated 3 years ago
- A simple tool for splitting up an ebook into its chapters. Works well with Project Gutenberg texts. May also be used to clean up books fo… ☆109 Updated 6 years ago
- YT_subtitles - extracts subtitles from YouTube videos to raw text for Language Model training ☆43 Updated 4 years ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆47 Updated 2 years ago
- ☆67 Updated last year
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- Libraries, Archives and Museums (LAM) ☆84 Updated 2 years ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆189 Updated last year
- Code and data to support "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4" ☆69 Updated 2 years ago
- Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) wor… ☆213 Updated last year
- 💫 spaCy wrapper for ConceptNet 💫 ☆94 Updated last year
- Repo for the paper "Detecting Logical Fallacies: From Quiz to Climate Change News" (2021) ☆78 Updated last year
- A BERT-based application for reusable text classification at scale ☆38 Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm ☆67 Updated 3 months ago
- ☆165 Updated last year
- The AI Knowledge Editor ☆184 Updated 3 years ago
- Code for collecting, processing, and preparing datasets for the Common Pile ☆180 Updated last month
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆79 Updated last year
- Code for SaGe subword tokenizer (EACL 2023) ☆25 Updated 7 months ago
- Create soft prompts for fairseq 13B dense, GPT-J-6B and GPT-Neo-2.7B for free in a Google Colab TPU instance ☆28 Updated 2 years ago
- LLM sampler that only allows words that are in the Bible ☆27 Updated 7 months ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆68 Updated 2 years ago
- An open-source replication and extension of Meta AI's LLaMA dataset ☆24 Updated 2 years ago
- 🗺️ Data Cleaning and Textual Data Visualization 🗺️ ☆179 Updated last month
- Mining Legal Arguments in Court Decisions - Data and software ☆68 Updated 2 years ago
- ☆94 Updated last year
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4. ☆34 Updated 2 years ago
- Completion After Prompt Probability. Make your LLM make a choice ☆79 Updated 8 months ago
- Semantic search engine indexing 110 million academic publications ☆90 Updated last week
- Multilingual syllable annotation pipeline component for spaCy ☆39 Updated 2 years ago