analysis of public NLP corpora
☆11Feb 9, 2023Updated 3 years ago
Alternatives and similar repositories for whatsinthebox
Users that are interested in whatsinthebox are comparing it to the libraries listed below.
- Data Annotation Tool for Named Entity Recognition using Active Learning and Transfer Learning☆10Aug 20, 2021Updated 4 years ago
- An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.☆10Mar 17, 2026Updated last month
- Stolemojis never die. A collection of Slack emojis from past, present, and future companies.☆10Feb 5, 2026Updated 2 months ago
- utility to fetch provenance information from Internet Archive's Wayback Machine☆15Feb 5, 2026Updated 2 months ago
- Repository for the paper Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning☆36May 2, 2023Updated 2 years ago
- Code for FACTOID dataset paper in LREC 2022☆18Dec 19, 2022Updated 3 years ago
- A TensorFlow implementation of perceptual generative autoencoder (PGA).☆22Nov 2, 2020Updated 5 years ago
- Seamlessly build the MuMiN dataset.☆31Dec 20, 2023Updated 2 years ago
- Tools to make Supernote devices even more super☆25Mar 7, 2026Updated last month
- ☆11Sep 20, 2019Updated 6 years ago
- Automating description for Web Archives in ArchivesSpace using the Archive-It CDX and Partner Data APIs☆11Aug 10, 2018Updated 7 years ago
- Repository of documentation about the open datasets published by the UK Web Archive.☆15Jun 21, 2019Updated 6 years ago
- Colored Console primitives for Rust CLIs☆18Jul 3, 2023Updated 2 years ago
- GraphPass is a utility to filter networks and provide a default visualization output for Gephi or SigmaJS.☆17Nov 14, 2020Updated 5 years ago
- Use Gnip's API to create and control Historical Powertrack jobs.☆17Sep 24, 2016Updated 9 years ago
- PyCon India 2019 Tasks and Coordination☆11Oct 9, 2019Updated 6 years ago
- Interactive hierarchical network navigator☆15May 2, 2023Updated 2 years ago
- dataset of podcasts and episodes☆14Jan 16, 2018Updated 8 years ago
- WIP: Parse archived parler pages into structured html☆15Feb 16, 2021Updated 5 years ago
- datamining the parler datadump☆15Jan 19, 2021Updated 5 years ago
- Notes from our paper reading sessions☆16Sep 24, 2020Updated 5 years ago
- An analytics engineering sandbox focusing on real estate prices in Cook County, IL☆10Aug 5, 2025Updated 8 months ago
- ☆11Apr 9, 2022Updated 4 years ago
- ☆12Feb 2, 2024Updated 2 years ago
- Ruby and MongoDB integration for the back end of any new Twitter analytics/processing project☆32May 8, 2025Updated 11 months ago
- Python wrapper for the Java-based Maximal Information-based Nonparametric Exploration (MINE) statistics library☆19Feb 3, 2012Updated 14 years ago
- A LevelDB backed URL unshortening microservice written in JavaScript☆31Dec 10, 2022Updated 3 years ago
- ☆13Feb 24, 2020Updated 6 years ago
- EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers☆59Apr 13, 2026Updated last week
- The FORT team has released differentially private Condor data to external researchers in H1 2020. It is known that analyzing DP data via …☆13Jan 12, 2026Updated 3 months ago
- Analyzing crime reported in the U.S. using data derived from commoncrawl, New York Times api and twitter data.☆18Aug 28, 2019Updated 6 years ago
- ☆11Sep 16, 2021Updated 4 years ago
- Here is the repository containing our code implementation of Spatio-Temporal Graph Transformer (STGormer).☆14Aug 25, 2024Updated last year
- Code and data to support "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4"☆70May 2, 2023Updated 2 years ago
- Fine-tuning GPT-2 on articles followed by text generation☆23Apr 19, 2022Updated 4 years ago
- The official implementation of Spatiotemporal Gated Traffic Trajectory Simulation with Semantic-aware Graph Learning (Information Fusion …☆10May 6, 2024Updated last year
- NVScript - Dark Engine script module for Thief 2, Thief 1, and System Shock 2☆14Mar 27, 2026Updated 3 weeks ago
- Web archiving utility library☆11Mar 11, 2026Updated last month
- Logical Operations On Puzzles: Simple Iterative Reasoning Tests for LLMs first through wordgrids☆18Feb 19, 2025Updated last year