Support fine-tuning GLM4v with ZeRO-2
☆16Jun 29, 2024Updated last year
Alternatives and similar repositories for GLM4v-Finetune
Users that are interested in GLM4v-Finetune are comparing it to the libraries listed below.
- [CVPR 2026] HiconAgent: History Context-aware Policy Optimization for GUI Agents☆27Mar 9, 2026Updated 3 weeks ago
- ☆12Aug 8, 2024Updated last year
- Data set for the IEEE TGRS paper "Mutual Attention Inception Network for Remote Sensing Visual Question Answering"☆22Nov 14, 2022Updated 3 years ago
- AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management☆25Mar 17, 2026Updated last week
- ☆11Apr 23, 2023Updated 2 years ago
- ☆12Aug 24, 2020Updated 5 years ago
- [BMVC 2022] Information Theoretic Representation Distillation☆19Oct 6, 2023Updated 2 years ago
- Accompanying code to reproduce the baselines of the International Multimodal Sentiment Analysis Challenge (MuSe 2020).☆16Dec 8, 2022Updated 3 years ago
- Multi-modal fusion framework based on Transformer Encoder☆16Dec 20, 2020Updated 5 years ago
- Code release for "CLIPood: Generalizing CLIP to Out-of-Distributions" (ICML 2023), https://arxiv.org/abs/2302.00864☆70Sep 17, 2023Updated 2 years ago
- Tool to parse wiki tables from the HTML dump of Wikipedia☆11Jun 12, 2022Updated 3 years ago
- Code for "Revisiting Batch Norm Initialization".☆12Jul 14, 2022Updated 3 years ago
- A curated list of zero-shot captioning papers☆24Aug 26, 2023Updated 2 years ago
- [IJCAI 2024] CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning☆25Feb 1, 2024Updated 2 years ago
- ☆19Sep 18, 2021Updated 4 years ago
- ☆12Feb 27, 2025Updated last year
- Official code of "IRNet: Iterative Refinement Network for Noisy Partial Label Learning"☆21Oct 8, 2025Updated 5 months ago
- The official codes for Fast Monte Carlo Rendering via Multi-Resolution Sampling☆15Dec 2, 2021Updated 4 years ago
- This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training.☆10Jan 26, 2024Updated 2 years ago
- Modeling Stroke Mask for End-to-End Text Erasing☆19Feb 9, 2023Updated 3 years ago
- A LangChain-based RAG knowledge base system for academic papers☆16Sep 25, 2024Updated last year
- ☆35Jun 20, 2024Updated last year
- [AAAI 2024] Continuous Piecewise-Affine Based Motion Model for Image Animation☆35Jan 19, 2024Updated 2 years ago
- An implementation of DSOD in PyTorch☆15Jul 13, 2018Updated 7 years ago
- LoRA☆17Apr 15, 2023Updated 2 years ago
- Automatically generates captions for an image using Image processing and NLP. Model was trained on Flickr30K dataset.☆11Jun 11, 2020Updated 5 years ago
- m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks☆45Sep 26, 2024Updated last year
- The implementation of the paper "Asymmetric Contextual Modulation for Infrared Small Target Detection" in PyTorch☆58Dec 4, 2020Updated 5 years ago
- AnyTrans: Translate AnyText in the Image with Large Scale Models (EMNLP2024 Findings)☆24Dec 11, 2024Updated last year
- ☆10Jan 3, 2023Updated 3 years ago
- Phonetically balanced text to speech sentences☆10Aug 16, 2021Updated 4 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- MAT: Mask-Aware Transformer for Large Hole Image Inpainting☆17Apr 1, 2022Updated 3 years ago
- [NAACL 2025] Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning☆12Feb 9, 2025Updated last year
- [CVPR 2024] Leveraging Vision-Language Models for Improving Domain Generalization in Image Classification☆40Mar 6, 2024Updated 2 years ago
- Real-time object detection using YOLO for online shopping☆11Mar 17, 2019Updated 7 years ago
- T-Circle-S3 is a development board with a 0.75-inch onboard small screen developed based on the ESP32S3, equipped with a speaker, microph…☆31Oct 25, 2025Updated 5 months ago
- ☆11Apr 23, 2023Updated 2 years ago
- Xiaozhi (小智) performance optimization research with public data reports, supporting low-cost, high-performance deployment☆63Dec 15, 2025Updated 3 months ago