gr33ndata/dmoz-urlclassifier

Preparing DMOZ dataset for my n-Gram LM-based URL classification research

☆31

Alternatives and similar repositories for dmoz-urlclassifier

Users who are interested in dmoz-urlclassifier are comparing it to the repositories listed below.

cjdd3b / pairwise-mapreduce
Implementation of a pairwise document similarity algorithm using MapReduce.
☆15 · Nov 16, 2011 · Updated 14 years ago
kahliloppenheimer / Web-page-classification
Classifies webpages into categories defined in DMOZ dataset
☆39 · Dec 14, 2015 · Updated 10 years ago
TeamHG-Memex / url-summary
Show summary of a large number of URLs in a Jupyter Notebook
☆19 · Apr 8, 2026 · Updated 3 months ago
unixpickle / rwa
RWA recurrent neural networks
☆17 · Apr 14, 2017 · Updated 9 years ago
wwoods / keras_pickle_wrapper
A small library that wraps Keras models to pickle them.
☆14 · Jul 17, 2018 · Updated 8 years ago
arne-cl / ppi_graphkernel
all-paths graph kernel for protein-protein interaction extraction
☆12 · Apr 22, 2014 · Updated 12 years ago
gr33ndata / irlib
Information Retrieval Library (in Python)
☆82 · Dec 18, 2021 · Updated 4 years ago
scrapinghub / autopager
Detect and classify pagination links
☆15 · Sep 9, 2020 · Updated 5 years ago
alexhallam / TensorFlow-Survival-Analysis
Making survival analysis work in TensorFlow
☆19 · Jun 4, 2017 · Updated 9 years ago
chrisPiemonte / url2vec
Graph clustering and Node embeddings with word2vec
☆14 · Mar 2, 2019 · Updated 7 years ago
lapras-inc / disk-embedding
Hyperbolic Disk Embeddings for Directed Acyclic Graphs (ICML 2019)
☆20 · May 13, 2019 · Updated 7 years ago
shritesh / brainfuck-rs-wasm
A Brainfuck interpreter written in Rust and compiled to WebAssembly
☆10 · Dec 4, 2017 · Updated 8 years ago
AnthonyMRios / adversarial-relation-classification
Unsupervised domain adaptation method for relation extraction
☆18 · Jul 16, 2018 · Updated 8 years ago
subsetpark / ntypes
A wrapper around Python's ctypes for Nim-specific function signatures.
☆12 · Dec 12, 2017 · Updated 8 years ago
daremon / urlclustering
Package to facilitate URL clustering
☆71 · Feb 24, 2016 · Updated 10 years ago
lestrrat-go / urlenc
Marshal/Unmarshal interface for structs that can encode/decode themselves to URL query strings
☆11 · Jun 6, 2018 · Updated 8 years ago
sid321axn / Detection_of_Malicious_URLs
In this project, we have detected the malicious URLs using lexical features and boosted machine learning algorithms
☆20 · Aug 19, 2020 · Updated 5 years ago
allenai / label_rationale_association
Code for EMNLP 2021 paper "Measuring Association Between Labels and Free-Text Rationales"
☆12 · Sep 12, 2023 · Updated 2 years ago
TeamHG-Memex / extract-html-diff
extract difference between two html pages
☆33 · Apr 8, 2026 · Updated 3 months ago
chop-dbhi / drug_word_embeddings
development and intrinsic evaluation of drug related word embeddings using PubMed abstracts and DrugBank
☆14 · Feb 3, 2017 · Updated 9 years ago
NetManAIOps / DOMI_code
code for DOMI
☆12 · Mar 24, 2023 · Updated 3 years ago
outlace / Gridworld
Simple implementation of text-based Gridworld game. Intended for use with reinforcement learning algorithms.
☆15 · Apr 29, 2018 · Updated 8 years ago
songweige / Dmoz-Dataset
content.rdf.u8.gz
☆11 · Dec 15, 2020 · Updated 5 years ago
PapenfussLab / HaveYouSwappedYourSamples
This project contains simple methods to measure sample relatedness and identify potential swaps and contamination
☆10 · Jan 8, 2016 · Updated 10 years ago
voider1 / hyperdav
WebDAV client for Rust
☆10 · Jun 6, 2018 · Updated 8 years ago
dstein64 / pyfms
A Theano-based Python implementation of Factorization Machines (Rendle 2010).
☆26 · Dec 13, 2022 · Updated 3 years ago
vishal1796 / pytorch-fast-neural-style
Fast Style Transfer in Pytorch
☆10 · Mar 1, 2017 · Updated 9 years ago
computationalmodelling / python-package-template
Attempt to provide a good-practice template for Python packages
☆11 · May 13, 2016 · Updated 10 years ago
torch / sys
A system utility package for Torch.
☆13 · Dec 22, 2017 · Updated 8 years ago
xtannier / WebAnnotator
WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi…
☆48 · Dec 17, 2021 · Updated 4 years ago
JoshuaLelon / Pytorch-Tutorial
I had a lot of questions as I went through the Deep Learning Blitz tutorial from pytorch.org, so I made my own tutorial trying to answer …
☆12 · Jun 16, 2018 · Updated 8 years ago
skytreader / CleverAlgorithms-Python
The Clever Algorithms project is an effort to describe a large number of algorithmic techniques from the field of Artificial Intelligence…
☆29 · Oct 28, 2018 · Updated 7 years ago
shreyakupadhyay / AutoSignUp
Automatic the authentication of various websites and scraping them by using mail servers
☆10 · May 3, 2017 · Updated 9 years ago
rayg1234 / pytlib
A pytorch framework for building neurals networks for visual recognition, encoding, and detection tasks. The goal is to bridge the gap be…
☆10 · Dec 20, 2019 · Updated 6 years ago
landaal-ict / eigenrouter
Setup KPN On Routers
☆10 · Dec 23, 2022 · Updated 3 years ago
maka89 / noisy-gp
Gaussian Process Regression for training data with noisy inputs and/or outputs
☆10 · Mar 22, 2017 · Updated 9 years ago
beoite / GodotRuntimeInspector
A runtime inspector for the godot game engine.
☆12 · Sep 6, 2025 · Updated 10 months ago
pathology-dynamics / biomedical-entity-linking
☆26 · Mar 17, 2026 · Updated 4 months ago
felipelouza / sacak-lcp
Optimal suffix sorting and LCP array construction for constant alphabets [IPL 2017]
☆10 · Aug 17, 2018 · Updated 7 years ago