allenai / dolma3
☆46Jan 20, 2026Updated 3 weeks ago
Alternatives and similar repositories for dolma3
Users who are interested in dolma3 are comparing it to the libraries listed below.
- Tooling for exact and MinHash deduplication of large-scale text datasets☆68Feb 4, 2026Updated last week
- decontamination☆24Dec 3, 2025Updated 2 months ago
- An open-source session replay tool for single-page applications that uses AI analysis, aggregated trends, and a RAG chatbot to help devel…☆11Jan 23, 2026Updated 3 weeks ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];☆40Jan 4, 2024Updated 2 years ago
- [SIGIR 2023] Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction☆42Apr 5, 2023Updated 2 years ago
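- A data development platform contributed by APEX and built on big-data platform capabilities. It helps enterprises connect data at minimal cost, build and accumulate data warehouse models, lower the barrier to data applications, and accumulate data value.☆12Oct 31, 2024Updated last year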
- Simplifies data migration between Apache Ignite clusters by relying on Apache Avro as an intermediate storage format☆13Jun 27, 2023Updated 2 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- ☆17Aug 5, 2025Updated 6 months ago
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".☆10Apr 30, 2023Updated 2 years ago
- KuaiSearch PERKS☆12Nov 16, 2021Updated 4 years ago
- Using OpenVINO to speed up inference of PaddleOCR-VL model☆22Feb 3, 2026Updated last week
- Collaborative Discourse Manager☆11Nov 6, 2016Updated 9 years ago
- Azure Machine Learning - MLOps Python SDKv2☆10Jul 24, 2023Updated 2 years ago
- Simplified DOM Trees for Transferable Attribute Extraction from the Web☆40Sep 27, 2024Updated last year
- Get aid from local LLMs right in your PowerShell☆15May 2, 2025Updated 9 months ago
- ElectronJS app to use Groq's Whisper model from a terminal on the desktop.☆11Updated this week
- [ICML 2024] Code for the paper "MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts"☆10Jul 1, 2024Updated last year
- BERT score for text generation☆12Jan 15, 2025Updated last year
- A game engine made in Java using libgdx (Currently in alpha state, and probably will remain that way)☆16Jan 4, 2012Updated 14 years ago
- ☆11Oct 11, 2023Updated 2 years ago
- ☆15May 11, 2025Updated 9 months ago
- Upload a document image or PDF, or provide a URL, to convert it into a structured format using SmolDocling.☆16Mar 31, 2025Updated 10 months ago
- Run all the tests at the same time with modal.com☆11Mar 2, 2024Updated last year
- A paper comparing Dask and Spark☆10Dec 9, 2022Updated 3 years ago
- A sample project demonstrating a concrete use case for Python's multithreading.☆11Oct 6, 2020Updated 5 years ago
- On-the-fly Table Generation - SIGIR'18☆10Feb 1, 2020Updated 6 years ago
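- Building a small LLM to learn the LLM modeling and training process. A small model based on the DeepSeek-MOE architecture, for personal study, starting from scratch, with every statement explained.☆14Mar 28, 2025Updated 10 months ago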
- Automaton & Cognition☆16Apr 14, 2024Updated last year
- Attempt to understand Percy Liang's Dependency-based Compositional Semantics by implementing it in Python☆10Mar 10, 2013Updated 12 years ago
- PDF table extraction☆10Dec 14, 2021Updated 4 years ago
- Batch-monitors specified QQ message windows and forwards new messages to an email inbox☆11Apr 13, 2023Updated 2 years ago
- AWS Sample for extracting sensor data and detecting scenes from autonomous driving data collected in ROS bag files.☆12Sep 27, 2021Updated 4 years ago
- ☆11Jan 13, 2013Updated 13 years ago
- init☆13Feb 3, 2021Updated 5 years ago
- Huawei-Matebook-X-Pro-2018 Hackintosh☆12Jun 19, 2023Updated 2 years ago
- ☆11Feb 5, 2026Updated last week
- A smart distributed crawler that infers navigation models of structured websites, used to cluster pages based on their structure and extr…☆10Aug 17, 2025Updated 5 months ago
- 🍎Wende Chinese QA system (experimental)☆10Jun 1, 2021Updated 4 years ago