HumanEval-V / HumanEval-V-Benchmark
A Lightweight Visual Understanding and Reasoning Benchmark for Evaluating Large Multimodal Models through Coding Tasks
☆13 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for HumanEval-V-Benchmark
- A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆52 · Updated 3 weeks ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆83 · Updated 5 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆45 · Updated 7 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆74 · Updated this week
- Multilingual safety benchmark for Large Language Models ☆22 · Updated 2 months ago
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆27 · Updated 4 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆24 · Updated this week
- Weak-to-Strong Jailbreaking on Large Language Models ☆65 · Updated 8 months ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆57 · Updated 7 months ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA… ☆28 · Updated 3 months ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆21 · Updated 4 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆45 · Updated last month
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆26 · Updated 4 months ago
- Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI. ☆43 · Updated last year
- Code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning" ☆14 · Updated 5 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆67 · Updated 5 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆54 · Updated 3 months ago
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding ☆97 · Updated 3 months ago
- UniGen: A Unified Framework for Dataset Generation via Large Language Model ☆28 · Updated last month
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆38 · Updated 3 months ago
- [ATTRIB @ NeurIPS 2024] When Attention Sink Emerges in Language Models: An Empirical View ☆27 · Updated 3 weeks ago
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM ☆44 · Updated last week
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆96 · Updated 7 months ago
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" ☆67 · Updated 11 months ago