leogao2 / commoncrawl_downloader
☆32May 23, 2023Updated 2 years ago
Alternatives and similar repositories for commoncrawl_downloader
Users that are interested in commoncrawl_downloader are comparing it to the libraries listed below
- ☆16Dec 11, 2024Updated last year
- ☆16Mar 25, 2022Updated 3 years ago
- ☆95Jul 16, 2022Updated 3 years ago
- website for MS Marco☆34Mar 26, 2025Updated 10 months ago
- Evaluation tools shared across anserini, pyserini, and pygaggle☆35Jan 28, 2026Updated 2 weeks ago
- ZYN: Zero-Shot Reward Models with Yes-No Questions☆35Aug 15, 2023Updated 2 years ago
- BERT models for many languages created from Wikipedia texts☆33May 25, 2020Updated 5 years ago
- This project showcases engaging interactions between two AI chatbots.☆10Jan 10, 2024Updated 2 years ago
- Kinematic and dynamic models of continuum and articulated soft robots.☆15Nov 22, 2025Updated 2 months ago
- ☆1,636Apr 27, 2023Updated 2 years ago
- Security research organization dedicated to finding low-hanging, critical vulnerabilities.☆15May 12, 2022Updated 3 years ago
- ☆10Jan 23, 2025Updated last year
- A fun little data analysis project on whether Americans prefer Mexican food over Italian food or Chinese food.☆12Sep 11, 2017Updated 8 years ago
- Containerfile for the Vanilla OS Desktop+Nvidia image.☆16Feb 5, 2026Updated last week
- ☆11Jan 13, 2024Updated 2 years ago
- ☆53Feb 10, 2025Updated last year
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs☆11Jan 13, 2023Updated 3 years ago
- A higher quality RVC pretrained model to accelerate your training process.☆21Nov 11, 2025Updated 3 months ago
- Fake NEWS detector using LIAR dataset.☆11Aug 19, 2019Updated 6 years ago
- DuckDuckGo Image Search Results - Programmatically download Image Search Results☆10Jan 18, 2023Updated 3 years ago
- Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory.☆11Updated this week
- ☆10Jul 6, 2023Updated 2 years ago
- Wikimedia Enterprise - client SDK in Python☆20Nov 11, 2025Updated 3 months ago
- Script for downloading GitHub.☆98Jul 1, 2024Updated last year
- A Simple, Explainable Vision Language Model for detecting manufacturing defects in products☆14Sep 23, 2025Updated 4 months ago
- full code written for the Twilio blog https://www.twilio.com/blog/media-file-storage-python-flask-amazon-s3-buckets☆11May 4, 2024Updated last year
- camera monitoring and alerts using deepstack☆13Jun 2, 2020Updated 5 years ago
- LLM Chatbot with Retrieval Augmented Generation using Llamaindex. It works both in online and offline mode.☆13Dec 8, 2023Updated 2 years ago
- Tool to help migrate your application state from one version to another easily and reliably☆11Jan 22, 2026Updated 3 weeks ago
- ☆14May 6, 2018Updated 7 years ago
- ☆10Mar 26, 2022Updated 3 years ago
- ☆12Apr 26, 2024Updated last year
- parse_mediawiki_dump clone☆11Mar 22, 2025Updated 10 months ago
- ☆11May 6, 2025Updated 9 months ago
- Discontinued virtual desktop manager for Apple’s Mac OS X 10.4 "Tiger".☆18Sep 23, 2009Updated 16 years ago
- A UI designer for constructing AI applications with OpenSearch☆16Updated this week
- TREC Core track☆11Jul 5, 2017Updated 8 years ago
- Viewer for text datasets in formats like HuggingFace, JSONL, etc.☆15Feb 25, 2025Updated 11 months ago
- Simulated user for TREC 2016-2017 Dynamic Domain track☆10Dec 27, 2017Updated 8 years ago