Package and scripts used to build a dataset of Wikipedia articles in Markdown.
☆20 · Sep 11, 2023 · Updated 2 years ago
Alternatives and similar repositories for goodwiki
Users who are interested in goodwiki are comparing it to the repositories listed below.
- ☆26 · May 30, 2023 · Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. ☆50 · Jul 10, 2023 · Updated 2 years ago
- Network Etiquette (Netiquette) -- Written with 2020 technology in mind ☆10 · Nov 19, 2021 · Updated 4 years ago
- This repository defines a python class that can be used to load data for the tf.keras.model.fit_generator function by using a torch.utils… ☆11 · Oct 26, 2024 · Updated last year
- ☆38 · Apr 17, 2024 · Updated last year
- The pipeline for the OSCAR corpus ☆176 · Nov 9, 2025 · Updated 3 months ago
- Training a BERT model from scratch. ☆11 · Oct 15, 2023 · Updated 2 years ago
- All the tools that allow me to never ever open up Final Cut ☆11 · Feb 16, 2025 · Updated last year
- A simple code generator of JSON marshaler for go and tinygo. ☆10 · Feb 9, 2026 · Updated 3 weeks ago
- ☆12 · Jan 29, 2021 · Updated 5 years ago
- DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models ☆13 · Nov 2, 2023 · Updated 2 years ago
- Quora Paraphrasing Dataset Bahasa Indonesia Version ☆11 · Apr 18, 2021 · Updated 4 years ago
- ☆13 · May 21, 2023 · Updated 2 years ago
- Moral Machine Experiment on LLMs ☆11 · Updated this week
- Indonesian law dataset containing section annotation of court decision documents ☆17 · Jul 7, 2022 · Updated 3 years ago
- Mini Model Daemon ☆12 · Nov 9, 2024 · Updated last year
- A simple agent powered by LLMs that performs tasks. ☆13 · Apr 25, 2025 · Updated 10 months ago
- A drag-and-drop-enabled, responsive, envelope graph that allows to shape a wave with attack, decay, sustain and release ☆11 · Jan 5, 2023 · Updated 3 years ago
- ☆16 · Dec 21, 2023 · Updated 2 years ago
- Dataset and code to reproduce the results of the paper "Evolving Structures in Complex Systems" ☆11 · Dec 16, 2019 · Updated 6 years ago
- A minimal implementation of spotify/annoy in pure rust ☆11 · Mar 2, 2023 · Updated 3 years ago
- ☆10 · May 28, 2022 · Updated 3 years ago
- A chat implementation for FastHTML ☆11 · Sep 14, 2025 · Updated 5 months ago
- Turn Trello into a CMS to power all your websites and apps. ☆10 · May 12, 2018 · Updated 7 years ago
- A library for simplifying training with multi gpu setups in the HuggingFace / PyTorch ecosystem. ☆16 · Jan 9, 2026 · Updated last month
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆12 · Aug 15, 2024 · Updated last year
- Unbounded cache model for online language modeling with open vocabulary ☆11 · Feb 15, 2019 · Updated 7 years ago
- Train delay statistics ☆20 · Updated this week
- benchmarks for evaluating MT models ☆11 · Jun 26, 2024 · Updated last year
- Code of our paper "Method-Level Bug Severity Prediction using Source Code Metrics and LLMs" which is accepted to ISSRE 2023. ☆10 · Nov 12, 2023 · Updated 2 years ago
- ☆11 · Sep 8, 2024 · Updated last year
- Telegram bot framework written in PHP for OpenWRT ☆12 · Nov 27, 2022 · Updated 3 years ago
- A pipeline framework for data science projects ☆10 · Aug 9, 2022 · Updated 3 years ago
- Build modern UIs in Jupyter with Python ☆12 · Dec 28, 2022 · Updated 3 years ago
- This example shows how to perform quantization aware training for transfer learned MobileNet-v2 network. ☆12 · Dec 19, 2023 · Updated 2 years ago
- Tied-Augment: Controlling Representation Similarity Improves Data Augmentation ☆14 · Oct 1, 2023 · Updated 2 years ago
- code and data used to build a training dataset for dragnet models ☆10 · Nov 29, 2020 · Updated 5 years ago
- A very easy-to-use wrapper of Duktape JavaScript engine, including wrappers for C, Go and Java. The bridge wrapper is also supporting mo… ☆14 · Dec 20, 2021 · Updated 4 years ago
- Script for downloading GitHub. ☆13 · Sep 24, 2020 · Updated 5 years ago