A collection of preprocessed datasets and pretrained models for generating paraphrases.
☆32 · Jul 13, 2021 · Updated 4 years ago
Alternatives and similar repositories for paraphrase-datasets-pretrained-models
Users who are interested in paraphrase-datasets-pretrained-models are comparing it to the libraries listed below.
- Quora Paraphrasing Dataset Bahasa Indonesia Version ☆11 · Apr 18, 2021 · Updated 4 years ago
- Welcome to our repository! This repository hosts the data on "IndoCollex: A Testbed for Morphological Transformation of Indonesian Word … ☆23 · Aug 10, 2021 · Updated 4 years ago
- English-Indonesian parallel corpora ☆17 · Aug 6, 2018 · Updated 7 years ago
- Benchmarking Multidomain English-Indonesian Machine Translation ☆16 · Dec 19, 2020 · Updated 5 years ago
- One click away from a locally downloaded, fine-tuned model, hosted on Hugging Face, with inference built in. In two hours. ☆24 · Nov 9, 2025 · Updated 4 months ago
- Dependency Parser and NER model for Bahasa Indonesia (spaCy 2.1) ☆20 · Jul 17, 2020 · Updated 5 years ago
- CLIP (Contrastive Language–Image Pre-training) trained on Indonesian data ☆19 · Dec 4, 2021 · Updated 4 years ago
- A fast RWKV Tokenizer written in Rust ☆54 · Aug 12, 2025 · Updated 6 months ago
- ☆28 · Nov 15, 2023 · Updated 2 years ago
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆34 · May 17, 2025 · Updated 9 months ago
- ☆10 · May 25, 2021 · Updated 4 years ago
- Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included… ☆40 · Sep 5, 2022 · Updated 3 years ago
- Use MobileNet SSD and OpenCV to detect and count cars on the road ☆12 · Jan 13, 2020 · Updated 6 years ago
- A Python Reddit scraper with dual-mode architecture: simple requests for small jobs, async + proxy rotation for large-scale scraping. Fea… ☆16 · Oct 30, 2025 · Updated 4 months ago
- Architecture for the Twint scraper that allows downloading tweets on many instances without API restrictions ☆10 · Nov 30, 2020 · Updated 5 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models" ☆10 · Jul 19, 2024 · Updated last year
- A Python algorithm to change the pitch of a voice in real time ☆13 · Dec 13, 2020 · Updated 5 years ago
- Automated insights for tabular data ☆10 · Feb 10, 2025 · Updated last year
- End-to-end information extraction pipeline built with LayoutLMv2, a pretrained model from Hugging Face ☆11 · Aug 15, 2023 · Updated 2 years ago
- Scrape the most mentioned stock tickers from Reddit (Wallstreetbets and Wallstreetbetsnew) ☆12 · Mar 5, 2021 · Updated 5 years ago
- A comprehensive ELT pipeline for analyzing passenger satisfaction data. Features a modern data architecture with Apache Airflow for extra… ☆12 · Oct 5, 2025 · Updated 5 months ago
- A Mechanistic-Interpretability study that finds the structural dynamics of Large Language Models under fine-tuning. ☆16 · May 30, 2025 · Updated 9 months ago
- A comprehensive API Client for interacting with UMLS APIs including Search, Source, CUI, Semantic Network, and Crosswalk APIs. ☆17 · Sep 18, 2024 · Updated last year
- Keyword extraction using Scake, KeyBERT, fine-tuned BERT-like Transformer models, and ChatGPT. ☆12 · May 22, 2023 · Updated 2 years ago
- CVPR 2023: PAniC-3D, Vtubers dataset downloader ☆13 · Apr 22, 2023 · Updated 2 years ago
- ☆16 · Jul 23, 2023 · Updated 2 years ago
- ☆16 · Sep 17, 2024 · Updated last year
- This repository defines a Python class that can be used to load data for the tf.keras.model.fit_generator function by using a torch.utils… ☆11 · Oct 26, 2024 · Updated last year
- A Python package for accessing the OpenCorporates API ☆11 · Feb 12, 2019 · Updated 7 years ago
- Prevent Data Issues in Django Apps ☆16 · Oct 9, 2024 · Updated last year
- Reading comprehension based question-answering model for news articles. ☆11 · Jun 22, 2022 · Updated 3 years ago
- Newspaper Segmentation into images and text ☆12 · Jan 11, 2019 · Updated 7 years ago
- Twitter-based sentiment analysis using Java and Hadoop. In this project we are doing the sentiment analysis on Twitter data to analyse wh… ☆10 · Apr 22, 2018 · Updated 7 years ago
- (READ ONLY MIRROR) The ProB Model Checker and Animator Plugin for Rodin ☆19 · Feb 26, 2026 · Updated last week
- Vietnamese GPT-J API service deployed with Docker & Helm chart ☆10 · Dec 11, 2022 · Updated 3 years ago
- The first large-scale summarization corpus for the Indonesian language. AACL 2020. ☆38 · Mar 4, 2021 · Updated 5 years ago
- ☆17 · Oct 28, 2025 · Updated 4 months ago
- Control LED strips wirelessly by sending them short animation programs ☆12 · Aug 22, 2021 · Updated 4 years ago
- Indonesian law dataset containing section annotation of court decision documents ☆17 · Jul 7, 2022 · Updated 3 years ago