mariosasko/datasets_sql

mariosasko / datasets_sql

Execute arbitrary SQL queries on 🤗 Datasets

☆32

Alternatives and similar repositories for datasets_sql

Users that are interested in datasets_sql are comparing it to the libraries listed below.

jsenellart / papers
This repo contains notes and implementations for cherry-picked publications of particular interest to me
☆12 · May 14, 2020 · Updated 6 years ago
nateraw / spaces-docker-templates
🚀🤗 A collection of templates for Hugging Face Spaces
☆35 · Oct 9, 2023 · Updated 2 years ago
ayaka14732 / basehangul-online
Online BaseHangul Encoder And Decoder
☆13 · Jan 30, 2023 · Updated 3 years ago
robgon-art / CLIPandPASTE
CLIP and PASTE: Using AI to Create Photo Collages from Text Prompts
☆28 · Jun 11, 2022 · Updated 4 years ago
morganmcg1 / wandb_spectrogram
☆15 · Sep 24, 2022 · Updated 3 years ago
Davisy / Detect-and-Translate-Text-Data
How to detect language and translate text data into the language of your choice when working on an NLP project
☆11 · Jan 13, 2021 · Updated 5 years ago
ineelhere / curated
An open source project built with Streamlit on Python that focuses on curating awesome resources for learning awesome skills.
☆14 · Jun 16, 2024 · Updated 2 years ago
lightonai / ducksearch
Efficient BM25 with DuckDB 🦆
☆68 · Dec 20, 2024 · Updated last year
nateraw / huggingface-datasets-converter
Scripts to convert datasets from various sources to Hugging Face Datasets.
☆57 · Oct 26, 2022 · Updated 3 years ago
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆72 · Feb 29, 2024 · Updated 2 years ago
Foroozani / BigData_PySpark
Handle Big Data for Machine Learning using Python and PySpark, Building ETL Pipelines with PySpark, MongoDB, and Bokeh
☆10 · Nov 12, 2021 · Updated 4 years ago
huggingface / trl-jobs
Train LLM on Hugging Face infra
☆72 · May 26, 2026 · Updated last month
johannwyh / StyleInV
Official Implementation of ICCV 2023 paper "StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation"
☆23 · May 10, 2024 · Updated 2 years ago
bekirbakar / replay-attack-detection
Deep learning-based audio spoofing attack detection experiments for speaker verification.
☆14 · Apr 20, 2023 · Updated 3 years ago
seanchen1991 / rust-data-structures
Just messing around with implementing data structures in Rust.
☆15 · May 23, 2025 · Updated last year
qdrant / quaterion-models
The collection of building blocks for building fine-tunable metric learning models
☆35 · Jul 6, 2026 · Updated 2 weeks ago
MayankG96 / Hackathons-and-challenges
This repository contains solutions to interesting hackathons, challenges and practice problems
☆11 · Oct 9, 2022 · Updated 3 years ago
ilaria-manco / song-describer
Song Describer is a data collection platform for annotating music with textual descriptions.
☆61 · Dec 3, 2024 · Updated last year
ezgisubasi / rasa-travel-chatbot
Here is my Senior Design Project that I implemented to graduate from Computer Engineering. It is a chatbot made in RASA and helps the use…
☆30 · Jun 20, 2021 · Updated 5 years ago
lucidrains / memory-editable-transformer
My explorations into editing the knowledge and memories of an attention network
☆35 · Dec 8, 2022 · Updated 3 years ago
MeLeLBGU / SaGe
Code for SaGe subword tokenizer (EACL 2023)
☆28 · Nov 30, 2024 · Updated last year
facebookresearch / UNIREX
This is the official PyTorch repo for "UNIREX: A Unified Learning Framework for Language Model Rationale Extraction" (ICML 2022).
☆28 · Feb 14, 2023 · Updated 3 years ago
pikiaboy / Investing.com-Scraper
Scraping Investing.com for gold and silver prices. Small Flask server for grabbing the data over the web
☆11 · Jun 9, 2019 · Updated 7 years ago
virex-84 / VoskIdentification
Test example of using a model for voice identification with the "Vosk" speech recognition library: https://alphacephei…
☆12 · Aug 14, 2023 · Updated 2 years ago
roddar92 / linguistics_problems
Natural language processing in examples and games
☆25 · May 22, 2026 · Updated last month
davanstrien / huggingface-tldr
Experimental tl;dr summaries for datasets on the Hugging Face Hub!
☆10 · Apr 4, 2024 · Updated 2 years ago
explosion / spacy-huggingface-pipelines
💥 Use Hugging Face text and token classification pipelines directly in spaCy
☆65 · Mar 18, 2024 · Updated 2 years ago
MrBananaHuman / open-korean-instructions
A collection of publicly available Korean instruction datasets for training language models.
☆19 · Jul 16, 2023 · Updated 3 years ago
BM-K / KoDiffCSE
Difference-based Contrastive Learning for Korean Sentence Embeddings
☆23 · Mar 11, 2026 · Updated 4 months ago
commoncrawl / web-languages
Crowd-sourced lists of URLs to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …
☆71 · Jul 1, 2026 · Updated 2 weeks ago
web-archive-group / heritrix-walkthrough
☆10 · Jun 10, 2016 · Updated 10 years ago
huggingface / fuego
[WIP] A 🔥 interface for running code in the cloud
☆87 · May 26, 2026 · Updated last month
sustcsonglin / gated_linear_attention_layer
☆32 · Jan 7, 2024 · Updated 2 years ago
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
facebookresearch / 6DoF-Auraliser
An auralisation system that takes head-worn microphone array recordings as input and renders the audio for binaural playback; taking in…
☆37 · Oct 10, 2023 · Updated 2 years ago
huggingface / pyspark_huggingface
PySpark custom data source for Hugging Face Datasets
☆26 · Jul 3, 2026 · Updated 2 weeks ago
knowledgeable-embedding / knowledgeable-embedding
Knowledgeable Embedding: Injecting dynamically updatable entity knowledge into embeddings to enhance RAG
☆15 · Aug 31, 2025 · Updated 10 months ago
ml-tue / automated-string-cleaning
Repository for my master thesis on automated string handling
☆17 · Jul 17, 2021 · Updated 5 years ago
chrismedrela / docs-guide
The Hitchhiker's Guide to Documentation!
☆22 · Apr 25, 2015 · Updated 11 years ago