KnightofDawn / books-1
IT technical books, text versions in mobi and epub formats
☆9Updated 5 years ago
Related projects
Alternatives and complementary repositories for books-1
- Code and Data Repo for NeurIPS 2024 Paper "Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning"☆13Updated 5 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆33Updated 3 weeks ago
- Code for ACL 2024 paper "Soft Self-Consistency Improves Language Model Agents"☆16Updated 2 months ago
- Code for the paper: Rehearsal-free Continual Language Learning via Efficient Parameter Isolation☆13Updated last year
- CLIP-MoE: Mixture of Experts for CLIP☆17Updated last month
- Arxiv daily paper downloader and manage papers with markdown preview.☆29Updated 4 months ago
- A repository of useful research and skill-upgrading talks or articles in the NLP/CV/AI areas (in Chinese).☆69Updated 3 months ago
- CHAIR metric is a rule-based metric for evaluating object hallucination in caption generation.☆23Updated last year
- ☆10Updated 2 years ago
- A Self-Training Framework for Vision-Language Reasoning☆16Updated last week
- PyTorch implementation of StableMask (ICML'24)☆12Updated 4 months ago
- This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual Debias Decoding strat…☆72Updated 7 months ago
- Homework for the Fall 2020 UCAS Pattern Recognition course (Liu Chenglin, Xiang Shiming, Zhang Xuyao)☆9Updated 3 years ago
- ☆53Updated 7 months ago
- MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs☆14Updated 3 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆100Updated 2 weeks ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.☆32Updated last week
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆25Updated last week
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”☆29Updated last year
- EMNLP 2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions☆16Updated 5 months ago
- Do Vision and Language Models Share Concepts? A Vector Space Alignment Study☆12Updated last month
- ☆22Updated last year
- ☆22Updated 2 years ago
- ☆38Updated 5 months ago
- Data for evaluating GPT-4V☆11Updated last year
- Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024)☆48Updated last month
- Multi-GPU supported k-means clustering for cluster-clip☆9Updated 5 months ago
- ☆25Updated last month
- ☆13Updated 2 months ago
- [CVPR 23] Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!☆14Updated 6 months ago