huggingface / chug
Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.
☆151 · Updated 7 months ago
Related projects
Alternatives and complementary repositories for chug
- M4 experiment logbook ☆56 · Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ☆236 · Updated 4 months ago
- Generalised Contrastive Learning. This is a Repository for Google Shopping Dataset and Benchmarks followed by our novel fine-grained cont… ☆48 · Updated 2 weeks ago
- Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d… ☆189 · Updated 2 months ago
- Python Library to evaluate VLM models' robustness across diverse benchmarks ☆169 · Updated 3 weeks ago
- code for training & evaluating Contextual Document Embedding models ☆117 · Updated this week
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆173 · Updated 4 months ago
- The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions" ☆228 · Updated 2 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆130 · Updated this week
- ☆108 · Updated this week
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆169 · Updated 2 months ago
- ☆58 · Updated 8 months ago
- ☆73 · Updated 4 months ago
- LL3M: Large Language and Multi-Modal Model in Jax ☆65 · Updated 6 months ago
- Set of scripts to finetune LLMs ☆36 · Updated 7 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆252 · Updated last year
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers ☆58 · Updated 4 months ago
- Implementation of Infini-Transformer in Pytorch ☆104 · Updated last month
- Multipack distributed sampler for fast padding-free training of LLMs ☆178 · Updated 3 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability. ☆84 · Updated 2 months ago
- Scaling Data-Constrained Language Models ☆321 · Updated last month
- ☆64 · Updated last year
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆29 · Updated last month
- Understand and test language model architectures on synthetic tasks. ☆162 · Updated 6 months ago
- ☆77 · Updated 5 months ago
- ☆115 · Updated 3 weeks ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆71 · Updated last month
- ☆175 · Updated this week
- ☆62 · Updated last month
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆119 · Updated 3 months ago