☆25 · Mar 20, 2024 · Updated 2 years ago
Alternatives and similar repositories for download-from-common-crawl
Users who are interested in download-from-common-crawl are comparing it to the repositories listed below.
- Various Jupyter notebooks about Common Crawl data ☆64 · Nov 22, 2025 · Updated 4 months ago
- LiT (Zero-Shot Transfer with Locked-image text Tuning) image and text encoder models, working in the browser ☆11 · May 16, 2022 · Updated 3 years ago
- ☆20 · Sep 18, 2025 · Updated 6 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆32 · Jan 23, 2025 · Updated last year
- ☆12 · Mar 4, 2025 · Updated last year
- ☆11 · Sep 27, 2024 · Updated last year
- CRuby Dev Builds for GitHub Actions ☆23 · Updated this week
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal… ☆12 · Sep 16, 2024 · Updated last year
- A News Article Collection Library ☆22 · Mar 31, 2023 · Updated 2 years ago
- A Python implementation of discrete optimal transport with a Tsallis entropy regularization. ☆14 · Oct 23, 2023 · Updated 2 years ago
- Discord Docsbot, Built on bgent ☆11 · Jun 17, 2024 · Updated last year
- An experimental implementation of Burrows' Delta in Python 3 ☆12 · Jun 6, 2017 · Updated 8 years ago
- Statistics of Common Crawl monthly archives mined from URL index files ☆212 · Mar 19, 2026 · Updated last week
- Applying Reinforcement Learning from Human Feedback to language models to teach them to write short story responses to writing prompts. ☆14 · May 5, 2022 · Updated 3 years ago
- Python code for implementing embeddings in the Wasserstein space of elliptical distributions ☆10 · Jul 22, 2020 · Updated 5 years ago
- Poetry Corpora Annotated on Aesthetic Emotions ☆12 · Aug 2, 2022 · Updated 3 years ago
- ☆14 · Jan 5, 2026 · Updated 2 months ago
- Implementation of an Openset Recognition algorithm. ☆12 · Sep 13, 2020 · Updated 5 years ago
- Code to reproduce experiments from the EACL 2017 paper Continuous N-gram representation for Authorship Attribution ☆12 · Feb 6, 2017 · Updated 9 years ago
- A minimal TypeScript library for converting a json-rules-engine condition to a JsonLogic rule specification. ☆11 · Jul 6, 2022 · Updated 3 years ago
- Deployment of pywb as a CommonCrawl Index Server ☆21 · Oct 6, 2017 · Updated 8 years ago
- AI Liquidity Management Agent ☆13 · Jan 19, 2026 · Updated 2 months ago
- ☆20 · Mar 12, 2024 · Updated 2 years ago
- ☆16 · May 19, 2025 · Updated 10 months ago
- OpenAI Discord is an AI-powered bot for Discord that leverages the OpenAI API. It enables users to interact with ChatGPT and DALL-E in a n… ☆13 · Aug 3, 2023 · Updated 2 years ago
- CLAMM V1 & V2 ☆11 · Apr 1, 2025 · Updated 11 months ago
- ☆18 · Jun 27, 2016 · Updated 9 years ago
- ☆10 · Apr 12, 2024 · Updated last year
- ☆10 · Mar 7, 2024 · Updated 2 years ago
- ☆15 · Apr 25, 2023 · Updated 2 years ago
- Understanding attention for text classification ☆16 · Nov 27, 2020 · Updated 5 years ago
- Transient Labs Creator Contracts enabling creators to innovate with their own sovereign smart contracts. ☆13 · Updated this week
- Sentiment analysis of song lyrics compared to auditory track features and valence ☆13 · Feb 19, 2023 · Updated 3 years ago
- A resource hub for developers, PMs, and designers building LLM-forward products ☆14 · Apr 21, 2025 · Updated 11 months ago
- Beginner Friendly CheatCodes ☆14 · Aug 6, 2024 · Updated last year
- ☆14 · Jun 22, 2024 · Updated last year
- ☆14 · Aug 19, 2024 · Updated last year
- Practical examples of our Embed button, API and Webhooks! ☆12 · Dec 5, 2025 · Updated 3 months ago
- Figma plugin to fill your text layers with your own predefined JSON data ☆10 · Jan 1, 2023 · Updated 3 years ago