Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code
★61 · Feb 8, 2025 · Updated last year
Alternatives and similar repositories for top-open-subtitles-sentences
Users who are interested in top-open-subtitles-sentences are comparing it to the libraries listed below.
- Simple word to frequency mappings for the German language based on text corpora and using CISTEM stemmer. · ★14 · Apr 3, 2021 · Updated 5 years ago
- 5050 most frequent words in 109 languages · ★51 · Dec 8, 2022 · Updated 3 years ago
- temporary files created by opensubtitles-scraper · ★17 · Feb 3, 2026 · Updated 2 months ago
- NGRAMS is a search engine for the Google Books Ngram Dataset. This repository contains documentation, discussions, announcements, and iss… · ★24 · Dec 31, 2025 · Updated 3 months ago
- A C vector library similar to the C++ STL vector · ★23 · Apr 20, 2025 · Updated 11 months ago
- xinput2 touch input examples · ★13 · Aug 24, 2013 · Updated 12 years ago
- ScrapeAW is a framework that scrapes IPs across the world using Shodan without an API · ★11 · May 16, 2024 · Updated last year
- Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code · ★104 · Aug 14, 2023 · Updated 2 years ago
- ★10 · Jan 16, 2024 · Updated 2 years ago
- Markdown Templates for Offensive Security OSCP, OSWE, OSCE, OSEE, OSWP exam report · ★10 · Nov 28, 2024 · Updated last year
- Cross platform local yomichan/yomitan server to play audio (without Anki) · ★12 · Nov 16, 2025 · Updated 4 months ago
- ★11 · Jan 9, 2020 · Updated 6 years ago
- An isolated environment for DNS cache poisoning attack investigation and demonstration. · ★10 · Nov 22, 2020 · Updated 5 years ago
- A collection of fun and interesting words in English used in the Insanity Jam's Game Idea Generator · ★13 · Sep 8, 2022 · Updated 3 years ago
- Material for the Text Analysis of Arabic course taught at the NYU Abu Dhabi Winter Institute in Digital Humanities 2020. · ★15 · Jan 30, 2020 · Updated 6 years ago
- Utilities for Pentesting with BloodHound · ★23 · Updated this week
- Creating super-parallel corpora of more than 1500 unique languages for NLP research · ★34 · Dec 8, 2022 · Updated 3 years ago
- Deepspeech/Coqui AI speech to text systems in Esperanto. - Parolrekoniloj en Esperanto uzante Deepspeech/Coqui Ai. · ★10 · Jan 11, 2022 · Updated 4 years ago
- ★11 · Apr 20, 2023 · Updated 2 years ago
- CFD case for simulation of RD107 rocket engine · ★15 · Sep 17, 2025 · Updated 6 months ago
- ★10 · Mar 6, 2026 · Updated last month
- A collection of captured SSH login credentials · ★17 · Mar 28, 2021 · Updated 5 years ago
- World CIDR IP lists · ★10 · Jan 28, 2026 · Updated 2 months ago
- This script allows you to scrape the shodan.io IoT search engine and get device IPs without using your search or download credits! · ★12 · May 26, 2021 · Updated 4 years ago
- Hackers Don't Give A Shit · ★16 · Feb 2, 2020 · Updated 6 years ago
- Hebrew Diacritizer · ★49 · Mar 26, 2026 · Updated 2 weeks ago
- A BugBounty playbook covering vulnerability bypasses, payloads, and quick checks for OWASP Top 10 + extras. · ★22 · Sep 29, 2025 · Updated 6 months ago
- Khmer Character Specification · ★27 · Mar 14, 2025 · Updated last year
- CLI tool for discovering related base domains using WhoisXMLAPI's reverse Whois endpoints · ★12 · Jun 15, 2024 · Updated last year
- ★20 · Aug 3, 2022 · Updated 3 years ago
- A complete end-to-end Deep Learning system to generate high quality human like speech in English for Korean Drama (WIP) · ★13 · Sep 17, 2022 · Updated 3 years ago
- Custom Trickest Workflows · ★12 · Oct 26, 2023 · Updated 2 years ago
- A simple, open source, self-hosted todo manager. · ★20 · Sep 18, 2023 · Updated 2 years ago
- generates unique subdomain names and runs httpx on them · ★18 · Apr 8, 2024 · Updated 2 years ago
- 6-DOF nonlinear dynamic model (primarily for aircraft) · ★10 · Nov 16, 2021 · Updated 4 years ago
- Workshop 8 - Generalized additive models (GAMs) · ★14 · Sep 3, 2024 · Updated last year
- Tools for calculating psycholinguistically-relevant metrics of language statistics using transformer language models · ★12 · Nov 11, 2022 · Updated 3 years ago
- Visual Hash for matching copies of visually similar images. · ★16 · Mar 17, 2025 · Updated last year
- Unveiling Cyber Threats: From assets to Vulnerability Insights · ★17 · Oct 22, 2024 · Updated last year