deepmancer / vlm-toolbox
Vision-Language Models Toolbox: Your all-in-one solution for multimodal research and experimentation
☆12Updated 11 months ago
Alternatives and similar repositories for vlm-toolbox
Users who are interested in vlm-toolbox are comparing it to the libraries listed below.
- Library for converting from RGB / GrayScale image to base64 and back.☆19Updated 3 years ago
- ☆28Updated last year
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Updated 3 years ago
- Gemma2 (9B) and Llama3-8B-Finetune-and-RAG: sample code base, implemented on the Kaggle platform☆22Updated last year
- Minimal zero-shot intent classifier for arbitrary intent slot filling, via LLM prompting with LangChain.☆37Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Updated 4 months ago
- Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations☆14Updated 3 years ago
- Code for the AAAI 2023 paper "Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models"☆18Updated 3 years ago
- Using short models to classify long texts☆21Updated 2 years ago
- DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning mod…☆20Updated 3 years ago
- Repository containing awesome resources regarding Hugging Face tooling.☆48Updated 2 years ago
- ☆21Updated 3 years ago
- 🐜🔧 A minimalistic tool to fine-tune your LLMs☆18Updated 2 years ago
- A tiny package supporting distributed computation of COCO metrics for PyTorch models.☆15Updated 2 years ago
- A streamlit component to embed Disqus in your applications.☆10Updated 4 years ago
- Benchmarks for Business Document Foundation Models☆10Updated last year
- minimal scripts for 24GB VRAM GPUs. training, inference, whatever☆50Updated last month
- code for paper "Accessing higher dimensions for unsupervised word translation"☆22Updated 2 years ago
- Code for NeurIPS LLM Efficiency Challenge☆60Updated last year
- Generating Summaries with Controllable Readability Levels (EMNLP 2023)☆14Updated 6 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆23Updated last year
- ☆13Updated 3 years ago
- ☆44Updated 4 years ago
- Code for paper: "Privately generating tabular data using language models".☆15Updated 2 years ago
- Generating Training Data Made Easy☆43Updated 5 years ago
- ☆31Updated 2 years ago
- Streamlit demo app to demonstrate the features of transformers interpret with multiple models.☆25Updated 4 years ago
- ☆23Updated last year
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents☆37Updated 2 years ago
- ☆28Updated 2 years ago