Rust library for indexing and quickly searching large pretraining corpora
☆31Oct 30, 2025Updated 4 months ago
Alternatives and similar repositories for rusty-dawg
Users that are interested in rusty-dawg are comparing it to the libraries listed below.
- A file-backed dictionary for Python☆12Aug 15, 2022Updated 3 years ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated 2 years ago
- 3D geoms for plotnine (grammar of graphics in Python)☆12Aug 5, 2022Updated 3 years ago
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi…☆35May 24, 2024Updated last year
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning☆13Aug 17, 2023Updated 2 years ago
- Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span …☆14Aug 25, 2023Updated 2 years ago
- Gantry provides an API that streamlines running experiments in Beaker☆33Mar 11, 2026Updated 2 weeks ago
- Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models"☆29Jun 18, 2021Updated 4 years ago
- Rust library for working with data from Wikidata.☆14Jul 10, 2025Updated 8 months ago
- Rust implementation of probminhash, superminhash and hyperloglog sketching algorithms☆31Jan 22, 2026Updated 2 months ago
- Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)☆12Mar 7, 2024Updated 2 years ago
- Random program generator for Python☆10Jun 20, 2013Updated 12 years ago
- Corpus of naturalistic stories with annotation and psycholinguistic measures☆60May 14, 2025Updated 10 months ago
- ☆10Sep 6, 2024Updated last year
- Investigating Cultural Alignment of Large Language Models☆13Aug 14, 2024Updated last year
- ACL style for Typst☆22Jan 27, 2026Updated 2 months ago
- Codebase describing experiments in Truncation Sampling as Language Model Desmoothing☆13Dec 6, 2022Updated 3 years ago
- ☆23Jun 23, 2022Updated 3 years ago
- The corresponding code for our paper: "Exploring the Challenges of Open Domain Multi-Document Summarization". Do not hesitate to open an …☆33Jun 24, 2023Updated 2 years ago
- Data and code for the SciFact-Open task☆28Nov 24, 2023Updated 2 years ago
- This repository contains two datasets with multi-turn adversarial conversations generated by human agents interacting with a dialog model…☆32Jul 16, 2024Updated last year
- Easy trees in LaTeX and TikZ☆14Dec 16, 2022Updated 3 years ago
- Code for the ACL 2021 paper "Structural Guidance for Transformer Language Models"☆13Sep 17, 2025Updated 6 months ago
- 🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset☆13Sep 13, 2021Updated 4 years ago
- The Earleyx parser originated from Roger Levy's prefix parser, but has evolved significantly. Earleyx can generate Viterbi parses and…☆15Mar 27, 2014Updated 12 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Mar 6, 2023Updated 3 years ago
- A vim plugin used to auto insert code comment header block☆16Apr 4, 2013Updated 12 years ago
- Code Release for the 2023 NeurIPS Paper How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained langua…☆17Dec 6, 2024Updated last year
- ☆18Jan 17, 2024Updated 2 years ago
- Simple, extensible implementations of some meta-learning algorithms in Jax☆11Oct 6, 2020Updated 5 years ago
- Fast dataset format and loader☆24Updated this week
- Second Order Implementation of Hidden Markov Model for Tagging.☆15Mar 17, 2022Updated 4 years ago
- SAP Benchmark☆27Sep 18, 2024Updated last year
- Prism themes for plotnine, inspired by ggprism.☆17Aug 13, 2025Updated 7 months ago
- This is a read-only mirror of the CRAN R package repository. randomForestSRC — Fast Unified Random Forests for Survival, Regression, an…☆10Feb 12, 2026Updated last month
- Terraform module to create an Elastic Kubernetes (EKS) cluster and associated worker instances on AWS☆14Aug 26, 2020Updated 5 years ago
- Drop-in replacements for Python's map function☆15Sep 5, 2023Updated 2 years ago
- A string tokenizer library for Rust☆11May 16, 2018Updated 7 years ago
- source code of COLING2020 "Second-Order Unsupervised Neural Dependency Parsing"☆16Oct 24, 2022Updated 3 years ago