Tooling for exact and MinHash deduplication of large-scale text datasets
☆78 · Mar 24, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for duplodocus
Users who are interested in duplodocus compare it with the libraries listed below.
- Data mapping framework for Rust stuff ☆51 · Mar 25, 2026 · Updated 2 weeks ago
- ☆17 · Aug 5, 2025 · Updated 8 months ago
- ☆84 · Apr 7, 2026 · Updated last week
- Evaluating language models on word puzzle games ☆10 · Oct 25, 2024 · Updated last year
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones ☆65 · Jan 26, 2026 · Updated 2 months ago
- Official repository of PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning ☆57 · Oct 16, 2025 · Updated 5 months ago
- decontamination ☆30 · Mar 4, 2026 · Updated last month
- Official Implementation of wd1 ☆26 · Sep 25, 2025 · Updated 6 months ago
- Python client to interact with the lean4 language server. ☆41 · Mar 17, 2026 · Updated 3 weeks ago
- It's a cooler way to store simple linear models. ☆26 · Jul 15, 2024 · Updated last year
- Fork of Flame repo for training of some new stuff in development ☆19 · Updated this week
- ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling ☆113 · Mar 31, 2026 · Updated 2 weeks ago
- ☆21 · Jun 4, 2025 · Updated 10 months ago
- ☆56 · Mar 18, 2026 · Updated 3 weeks ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Official implementation of "On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning" ☆15 · Oct 13, 2022 · Updated 3 years ago
- Single-pass Adaptive Image Tokenization for Minimum Program Search | What's the Kolmogorov Complexity of an Image? ☆42 · Jul 26, 2025 · Updated 8 months ago
- ☆43 · Aug 5, 2025 · Updated 8 months ago
- Code examples for lecture series ☆33 · Nov 4, 2024 · Updated last year
- Comprehensive LLM evaluation at scale: A production-ready framework for evaluating large language models across multiple benchmarks. ☆38 · Updated this week
- ☆28 · Sep 22, 2025 · Updated 6 months ago
- ☆23 · Nov 26, 2024 · Updated last year
- Library classes for the Twelf Proof System ☆23 · Jun 16, 2020 · Updated 5 years ago
- ☆23 · Jun 25, 2021 · Updated 4 years ago
- This is the code repository for the AI project template. The idea of this template is to have a code framework prepared for any AI/ML/MLO… ☆41 · Jan 26, 2026 · Updated 2 months ago
- (PyTorch and TensorFlow) Implementation of Weighted Contrastive Loss (Deep Metric Learning by Online Soft Mining and Class-Aware Attentio… ☆21 · Oct 21, 2019 · Updated 6 years ago
- Revamped: Hugo+LoveIt ☆10 · Mar 14, 2026 · Updated last month
- A markdown native slides tool for academics building with agents. ☆125 · Apr 1, 2026 · Updated last week
- ☆89 · Apr 7, 2026 · Updated last week
- CVPR 2022 Continual Learning in Computer Vision Workshop Challenge ☆27 · Dec 15, 2022 · Updated 3 years ago
- A comprehensive framework for benchmarking single and multi-agent systems across a wide range of tasks—evaluating performance, accuracy, … ☆37 · Nov 11, 2025 · Updated 5 months ago
- ☆27 · Mar 4, 2025 · Updated last year
- ☆70 · Updated this week
- [COLM '25] Single-Pass Document Scanning for Question Answering ☆13 · Aug 20, 2025 · Updated 7 months ago
- ☆19 · Mar 3, 2026 · Updated last month
- Cloud Native Distributed Nearest Neighbour Search ☆15 · Jun 9, 2020 · Updated 5 years ago
- ☆14 · May 16, 2024 · Updated last year
- Website for hosting the Open Foundation Models Cheat Sheet. ☆270 · May 7, 2025 · Updated 11 months ago
- The diary of a n00b in the unfamiliar and terrifying venture into Capture-the-flag. ☆12 · Oct 14, 2018 · Updated 7 years ago