Useful tools to extract Malayalam text from the Common Crawl datasets
☆28 · Dec 11, 2024 · Updated last year
Alternatives and similar repositories for common-crawl-malayalam
Users that are interested in common-crawl-malayalam are comparing it to the libraries listed below.
- Language Modeling and Text Classification in Malayalam Language using ULMFiT ☆73 · Dec 8, 2022 · Updated 3 years ago
- Malayalam Corpus by Swathanthra Malayalam Computing ☆20 · Apr 2, 2023 · Updated 3 years ago
- Process Common Crawl data with Python and Spark ☆453 · Mar 26, 2026 · Updated 3 weeks ago
- A Machine Learning tool to create the training dataset very quickly & easily by using a smart chrome extension ☆14 · Feb 11, 2023 · Updated 3 years ago
- ☆11 · Dec 10, 2020 · Updated 5 years ago
- ☆14 · May 10, 2024 · Updated last year
- Example project for building scalable data pipelines with Kedro and Ibis. ☆14 · Dec 10, 2025 · Updated 4 months ago
- We train and deploy a serverless Sentiment Analysis API to GCP by using BERT (DistilBERT), TensorFlow, FastAPI, Python, Google AI Platfor… ☆12 · Mar 26, 2021 · Updated 5 years ago
- AWS Blog post code for running feature-extraction on images using AWS Batch and Cloud Development Kit (CDK). ☆20 · Oct 28, 2022 · Updated 3 years ago
- Common web archive utility code. ☆63 · Apr 1, 2026 · Updated 2 weeks ago
- ROS driver for RS-LiDAR-16 and RS-LiDAR-32 ☆11 · Mar 25, 2019 · Updated 7 years ago
- A Prot paper related materials ☆11 · Sep 5, 2022 · Updated 3 years ago
- Sort-friendly URI Reordering Transform (SURT) python module ☆45 · Sep 11, 2025 · Updated 7 months ago
- paper2code is a collection of AI/ML research papers rebuilt in Python, stripped of the abstractions that hide what's actually happening. ☆23 · Updated this week
- Tool to bridge Blender animation and physics-based robotic simulation ☆17 · Feb 27, 2026 · Updated last month
- EpiTator annotates epidemiological information in text documents. It is the natural language processing framework that powers GRITS and E… ☆42 · Jun 21, 2022 · Updated 3 years ago
- ☆19 · Sep 4, 2021 · Updated 4 years ago
- In this brief post I’d like to share my experience with the Kaggle Python Docker image, which simplifies the Data Scientist’s life …. ☆10 · Jan 8, 2018 · Updated 8 years ago
- This repo contains self made projects and learnables from various resources on using local LLMs and RAG ☆14 · May 26, 2025 · Updated 10 months ago
- Code referenced in the manuscript 'The 16S rRNA gene for species and strain-level microbiome analysis' ☆11 · Sep 26, 2019 · Updated 6 years ago
- Chapter 13 Learning to Run in book Deep Reinforcement Learning: code example of solving NIPS 2017: Learning to Run challenge with paralle… ☆13 · Jul 4, 2021 · Updated 4 years ago
- An attempt to create a sensor fusion model for camera & laser scanner inputs for Autonomous Vehicles ☆14 · Jul 25, 2021 · Updated 4 years ago
- ☆15 · Aug 18, 2021 · Updated 4 years ago
- ☆14 · Aug 3, 2022 · Updated 3 years ago
- Scripts to reproduce analyses of tradeSeq paper. ☆16 · Feb 5, 2020 · Updated 6 years ago
- Makhber is a free application for Visualization and Analysis of Scientific Data ☆21 · Jun 18, 2025 · Updated 10 months ago
- Shiny application implementing Consensus Clustering for various clustering algorithms. ☆13 · Jul 6, 2018 · Updated 7 years ago
- Jupyter notebook repository for reproducing analysis in "DNA Methylation Landscape of the Mouse Brain at Single-Cell Resolution" ☆14 · Apr 1, 2020 · Updated 6 years ago
- SAC + CPL training humanoids to play piano ☆13 · Mar 30, 2025 · Updated last year
- Stratified squared trans-ethnic genetic correlation ☆14 · May 12, 2022 · Updated 3 years ago
- The sophia project is the coding-based tutorial for machine intelligence (MI) and artificial intelligence (AI). ☆18 · Apr 1, 2022 · Updated 4 years ago
- ☆17 · Dec 21, 2020 · Updated 5 years ago
- ☆11 · Jun 4, 2020 · Updated 5 years ago
- Code to reproduce analyses in Nasser, Bergman, Fulco, Guckelberger, Doughty et al Nature 2021 ☆16 · Apr 8, 2021 · Updated 5 years ago
- Soccer Ball Detection and Tracking - 3DV ☆14 · Jun 15, 2021 · Updated 4 years ago
- covid question answering datasets and fine tuned models ☆18 · Apr 27, 2021 · Updated 4 years ago
- Use cases for anndata. ☆14 · Jun 11, 2025 · Updated 10 months ago
- Russian words synonyms and antonyms ☆11 · Dec 7, 2021 · Updated 4 years ago
- A suite of tools for processing genotype data. Includes calling genotypes from .idat to plink (ped), sample/case-control variant QC steps… ☆15 · Feb 5, 2026 · Updated 2 months ago