☆32May 23, 2023Updated 2 years ago
Alternatives and similar repositories for commoncrawl_downloader
Users that are interested in commoncrawl_downloader are comparing it to the libraries listed below
- ☆16Mar 25, 2022Updated 3 years ago
- Source code to "SliTraNet: Automatic Detection of Slide Transitions in Lecture Videos using Convolutional Neural Networks"☆10Dec 17, 2023Updated 2 years ago
- website for MS Marco☆34Mar 26, 2025Updated 11 months ago
- ☆78Dec 7, 2023Updated 2 years ago
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning☆13Aug 17, 2023Updated 2 years ago
- This project showcases engaging interactions between two AI chatbots.☆10Jan 10, 2024Updated 2 years ago
- ☆1,636Apr 27, 2023Updated 2 years ago
- ☆11Jan 13, 2024Updated 2 years ago
- A higher quality RVC pretrained model to accelerate your training process.☆21Nov 11, 2025Updated 3 months ago
- ☆15Sep 7, 2025Updated 6 months ago
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs☆11Jan 13, 2023Updated 3 years ago
- ☆53Feb 10, 2025Updated last year
- ☆10Jul 6, 2023Updated 2 years ago
- Wikimedia Enterprise - client SDK in Python☆20Nov 11, 2025Updated 3 months ago
- A Simple, Explainable Vision Language Model for detecting manufacturing defects in products☆14Sep 23, 2025Updated 5 months ago
- Code and data for the Walert large language model-based chatbot☆12Aug 14, 2025Updated 6 months ago
- A Gentle Introduction to RAG☆15Oct 8, 2024Updated last year
- Fake NEWS detector using LIAR dataset.☆11Aug 19, 2019Updated 6 years ago
- ☆10Jan 23, 2025Updated last year
- Script for downloading GitHub.☆98Jul 1, 2024Updated last year
- Containerfile for the Vanilla OS Desktop+Nvidia image.☆16Mar 1, 2026Updated last week
- Security research organization dedicated to finding low hanging, critical, vulnerabilities.☆15May 12, 2022Updated 3 years ago
- [CIKM 2023 Oral] This is the code repo for our CIKM‘23 paper "Text Matching Improves Sequential Recommendation by Reducing Popularity Bia…☆40Mar 17, 2024Updated last year
- Automate some common Turo work that I have to do manually☆11Aug 1, 2018Updated 7 years ago
- CERN Library integrated library system.☆14Feb 26, 2026Updated last week
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- Quora Paraphrasing Dataset Bahasa Indonesia Version☆11Apr 18, 2021Updated 4 years ago
- Via Text Density Simple Web Crawler With Go☆13Mar 19, 2023Updated 2 years ago
- Synthetic Data Generation with Execution-Based Verification and Grounding for LLM Training.☆19Feb 7, 2025Updated last year
- Fair Benchmarks☆10Mar 14, 2019Updated 6 years ago
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper.☆12Nov 4, 2021Updated 4 years ago
- prevent XSS attacks by sanitizing html (this is different than escaping!)☆22Oct 14, 2023Updated 2 years ago
- Active Response plugin. Osquery to execute wazuh/ossec active response plugins. You can write your own plugins, easy to plug☆11Jun 20, 2020Updated 5 years ago
- ☆10May 28, 2022Updated 3 years ago
- Indonesian law dataset containing section annotation of court decision documents☆17Jul 7, 2022Updated 3 years ago
- ☆11Dec 9, 2020Updated 5 years ago
- Summaries of findings from Augur's audits☆11Jun 28, 2018Updated 7 years ago
- Python and C++ library to process both experimental and simulation data of colloidal particles.☆15Sep 2, 2021Updated 4 years ago
- ☆10Jan 5, 2022Updated 4 years ago