Tooling for exact and MinHash deduplication of large-scale text datasets
☆76Mar 16, 2026Updated last week
Alternatives and similar repositories for duplodocus
Users that are interested in duplodocus are comparing it to the libraries listed below.
- ☆62Jan 20, 2026Updated 2 months ago
- ☆17Aug 5, 2025Updated 7 months ago
- PyTorch building blocks for the OLMo ecosystem☆967Updated this week
- decontamination☆27Mar 4, 2026Updated 2 weeks ago
- FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones☆64Jan 26, 2026Updated last month
- Official Implementation of wd1☆24Sep 25, 2025Updated 6 months ago
- ☆19Jun 4, 2025Updated 9 months ago
- Collection of LLM completions for reasoning-gym task datasets☆31Jul 4, 2025Updated 8 months ago
- A markdown native slides tool for academics building with agents.☆76Updated this week
- ☆121Feb 17, 2026Updated last month
- ☆55Mar 18, 2026Updated last week
- Building the cognitive-core to solve ARC-AGI-2☆27Feb 2, 2025Updated last year
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- Download and preparation tool for free speech corpora.☆16Apr 28, 2019Updated 6 years ago
- ☆43Aug 5, 2025Updated 7 months ago
- Single-pass Adaptive Image Tokenization for Minimum Program Search | What's the Kolmogorov Complexity of an Image?☆42Jul 26, 2025Updated 7 months ago
- ☆53Aug 5, 2025Updated 7 months ago
- ☆70Mar 17, 2026Updated last week
- ☆23Nov 26, 2024Updated last year
- [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity☆71Mar 10, 2026Updated 2 weeks ago
- ☆20Oct 10, 2025Updated 5 months ago
- The unified framework for sim & real robot teleoperation☆64Updated this week
- Revamped: Hugo+LoveIt☆10Mar 14, 2026Updated last week
- Runtime types for OCaml (beta version)☆27Jan 15, 2026Updated 2 months ago
- EAFT (Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting) official repo☆93Jan 15, 2026Updated 2 months ago
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).☆345Updated this week
- ☆26Mar 21, 2024Updated 2 years ago
- ☆19Mar 3, 2026Updated 3 weeks ago
- ☆21Mar 2, 2026Updated 3 weeks ago
- Hypercorn is an ASGI and WSGI Server based on Hyper libraries and inspired by Gunicorn.☆14Jan 12, 2026Updated 2 months ago
- A comprehensive framework for benchmarking single and multi-agent systems across a wide range of tasks—evaluating performance, accuracy, …☆36Nov 11, 2025Updated 4 months ago
- ☆26Mar 4, 2025Updated last year
- Implementation of the Delta Language☆13Mar 18, 2024Updated 2 years ago
- Turn your Solana Seeker (or any Android phone) into a 24/7 personal AI agent☆61Updated this week
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆84Oct 29, 2024Updated last year
- ☆68Updated this week
- Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models☆29Apr 17, 2025Updated 11 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆13Nov 27, 2023Updated 2 years ago
- [Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers☆16Feb 12, 2026Updated last month