institutional / institutional-books-1-pipeline
The Institutional Data Initiative's pipeline for analyzing, refining, and publishing the Institutional Books 1.0 collection.
☆45 Updated this week
Alternatives and similar repositories for institutional-books-1-pipeline
Users who are interested in institutional-books-1-pipeline are comparing it to the libraries listed below.
- Transformer GPU VRAM estimator ☆66 Updated last year
- Code for collecting, processing, and preparing datasets for the Common Pile ☆234 Updated last month
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆48 Updated last year
- Python library to use Pleias-RAG models ☆63 Updated 5 months ago
- Public repository containing METR's DVC pipeline for eval data analysis ☆120 Updated 6 months ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆52 Updated last month
- ☆29 Updated 2 years ago
- Code for pre-training BabyLM baseline models. ☆16 Updated 2 years ago
- LLM plugin for clustering embeddings ☆82 Updated last year
- Pivotal Token Search ☆128 Updated 3 months ago
- lossily compress representation vectors using product quantization ☆59 Updated 5 months ago
- ☆30 Updated 6 months ago
- SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? ☆190 Updated last week
- ☆57 Updated last year
- Pre-train Static Word Embeddings ☆87 Updated last month
- ☆72 Updated 2 months ago
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 Updated 2 years ago
- Efficiently computing & storing token n-grams from large corpora ☆26 Updated last year
- Small Python package to measure OCR quality and other related metrics. ☆25 Updated last year
- ☆78 Updated 10 months ago
- OLMost every training recipe you need to perform data interventions with the OLMo family of models. ☆50 Updated last week
- ☆23 Updated last year
- Your buddy in the (L)LM space. ☆64 Updated last year
- Flask app for article abstract and listing pages ☆172 Updated this week
- PyLate efficient inference engine ☆66 Updated last month
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆87 Updated last month
- ☆43 Updated last month
- Run models distributed as GGUF files using LLM ☆76 Updated 10 months ago
- Train, tune, and infer Bamba model ☆134 Updated 4 months ago
- Vector Database with support for late interaction and token level embeddings. ☆55 Updated 3 months ago