hynky1999/CmonCrawl

hynky1999 / CmonCrawl

Common crawl extractor

☆82

Alternatives and similar repositories for CmonCrawl

Users who are interested in CmonCrawl are comparing it to the libraries listed below.

jjonescz / awe
AI-based web extractor
☆12 · Feb 25, 2023 · Updated 3 years ago
CI-Research / KeywordAnalysis
Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends
☆57 · Jan 28, 2024 · Updated 2 years ago
CitizensFoundation / pace-keyword-scanner
Common Crawl keyword scanner. Scanning a month of CC data for hundreds of keywords takes about 3 hours on an EC2 c5.18xlarge instance. LLM (BER…
☆15 · Apr 1, 2023 · Updated 3 years ago
Mindbaz / python-gpostmaster-domains-datas
Downloads and flattens data from Google Postmaster Tools (GPT)
☆16 · May 26, 2026 · Updated last month
microsoft / Azure-Synapse-Content-Recommendations-Solution-Accelerator
This is a solution accelerator for creating personalized content recommendations based on user activity.
☆13 · Mar 26, 2024 · Updated 2 years ago
AshrafAlam / cloud-architecture-use-cases
List of real-world use cases showing where different Azure services fit.
☆15 · Apr 5, 2019 · Updated 7 years ago
TanayB11 / cosine
Private semantic search for your Obsidian vault
☆12 · Sep 12, 2023 · Updated 2 years ago
coderefinery / jupyter
Jupyter notebooks - A tool to write and share executable notebooks and data visualization
☆10 · Feb 5, 2026 · Updated 5 months ago
ufukhawk / XamCall
XamDesign Xamarin Forms call screen UI design
☆24 · Mar 7, 2020 · Updated 6 years ago
vvrahul11 / llm_chatbot
Web application that allows you to interact with biomedical knowledge graphs and query biomedical questions.
☆31 · Sep 20, 2023 · Updated 2 years ago
Dicklesworthstone / fastmcp_rust
Rust framework for building Model Context Protocol servers with cancel-correct async, zero-copy serialization, and first-class tool/resou…
☆29 · Updated this week
Asuna001 / KG-Crop
Knowledge-graph question-answering system for corn diseases and pests
☆15 · Dec 14, 2023 · Updated 2 years ago
othr-nlp / rage_toolkit
☆11 · Sep 27, 2024 · Updated last year
yxnchen / BATF
MATLAB code for "Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model".
☆15 · Nov 23, 2020 · Updated 5 years ago
seanchatmangpt / dslmodel
Structured outputs from DSPy and Jinja2
☆27 · Jun 27, 2025 · Updated last year
Wluper / lida
LIDA: Lightweight Interactive Dialogue Annotator (in EMNLP 2019)
☆10 · Oct 18, 2021 · Updated 4 years ago
jpfitzinger / tidyfit
An extension to the R tidyverse for automated ML. The package allows fitting and cross validation of linear regression and classification…
☆18 · Apr 29, 2025 · Updated last year
bysj2022NB / doctor_neo4j_spark_hadoop_rec2024
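Computer science graduation project: Hadoop + Spark knowledge-graph doctor recommendation system, outpatient volume prediction, medical data visualization, medical big data, medical data analysis, doctor web crawler, big data graduation project
☆11 · Jun 30, 2023 · Updated 3 years ago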
xueyouluo / biaffine-bert-relation-extract
Relation extraction model based on a BERT + Biaffine architecture
☆12 · Feb 23, 2022 · Updated 4 years ago
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 · Aug 25, 2023 · Updated 2 years ago
dpasse / extr
Named Entity Recognition (NER) and Relation Extraction (RE) library using Regular Expressions
☆10 · Jun 2, 2023 · Updated 3 years ago
TheRobBrennan / fixie-ai-llm-hackathon-20230916
This project explores my adventures doing a deep dive of OpenAI embeddings with Neo4j during the Fixie AI + LLM Hackathon on Saturday, Se…
☆15 · Sep 19, 2023 · Updated 2 years ago
friendly / candisc
Visualizing Generalized Canonical Discriminant and Canonical Correlation Analysis
☆16 · Jul 1, 2026 · Updated 3 weeks ago
andrewheiss / supply-demand-ggplot
Create supply/demand economics graphs with R and ggplot
☆11 · Sep 20, 2017 · Updated 8 years ago
wangpage / quant-ashare
Institutional-grade A-share quantitative trading system: complete implementation of Hermes multi-agent + Barra neutralization + Level2 microstructure + 15 industry-insider tricks
☆16 · Jul 17, 2026 · Updated last week
PyThaiNLP / MultiEL
Multilingual entity linking using the BELA model
☆12 · Jul 20, 2023 · Updated 3 years ago
KRICT-DATA / Perov_CGCNN
This is the repository of code and data for paper "Machine learning-enabled chemical space exploration of all-inorganic perovskites for p…
☆12 · Sep 23, 2024 · Updated last year
jerrywcy / Obsidian-English-Reading-Vault
An Obsidian vault for English reading, similar to LingQ.
☆12 · May 29, 2022 · Updated 4 years ago
brian-lou / Training-Data-Extraction-Attack-on-LLMs
This project explores training data extraction attacks on the LLaMa 7B, GPT-2XL, and GPT-2-IMDB models to discover memorized content usin…
☆15 · Jun 15, 2023 · Updated 3 years ago
ht2459 / gov_wj
☆15 · Mar 31, 2021 · Updated 5 years ago
VBot2410 / Deep-Q-Learning-Cartpole
Stabilizing an Inverted Pendulum on a cart using Deep Reinforcement Learning
☆10 · Jul 8, 2018 · Updated 8 years ago
maxim5 / cs224n-2019-winter
All lecture notes, slides and assignments from CS224n: Natural Language Processing with Deep Learning class by Stanford
☆12 · Dec 23, 2021 · Updated 4 years ago
rpahl / pipeflow
A beginner-friendly framework for building fast interactive data analysis pipelines in R that scale.
☆18 · Updated this week
JustAnotherArchivist / little-things
The little things give you away... A collection of various small helper stuff – Mirror repo only, no longer kept in sync, refer to gitea.…
☆24 · Sep 11, 2020 · Updated 5 years ago
autirahul / openAI_ChatBot
Basic OpenAI chatbot on a Neo4j knowledge graph
☆12 · Oct 4, 2023 · Updated 2 years ago
ex3ndr / glassium
Mobile App for AI Wearables
☆18 · May 22, 2024 · Updated 2 years ago
timsainb / curriculum_vitae
My Curriculum Vitae, generated in Python via Jinja from JSON fields into HTML. http://timsainburg.com/pages/cv.html
☆12 · Apr 6, 2025 · Updated last year
huangjie-nlp / GPLinker
Chinese entity relation extraction
☆15 · Apr 26, 2024 · Updated 2 years ago
ajaysub110 / critical-band-masking
Code for the NeurIPS 2023 paper "Spatial-frequency channels, shape bias, and adversarial robustness"
☆14 · Nov 5, 2023 · Updated 2 years ago