WebPAI / MRWeb
☆35Updated 9 months ago
Alternatives and similar repositories for MRWeb
Users who are interested in MRWeb are comparing it to the libraries listed below
- basically all the things I used for this article☆25Updated last year
- ☆32Updated 10 months ago
- ☆40Updated last year
- MTTM: Metamorphic Testing for Textual Content Moderation Software☆32Updated 2 years ago
- Multilingual safety benchmark for Large Language Models☆54Updated last year
- ☆37Updated last year
- Code and data for the paper: On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs☆128Updated 3 weeks ago
- [ICLR 2025] ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation☆130Updated 3 weeks ago
- Code and data for the paper: Apathetic or Empathetic? Evaluating LLMs' Emotional Alignments with Humans☆118Updated 3 weeks ago
- Code for the paper "Exploring Backdoor Vulnerabilities of Chat Models"☆18Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆64Updated last year
- [ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…☆109Updated 6 months ago
- ☆59Updated last year
- Code for ACL 2024 paper "Soft Self-Consistency Improves Language Model Agents"☆25Updated last year
- A benchmark for evaluating vision-centric, complex video reasoning.☆35Updated 2 weeks ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping☆61Updated 7 months ago
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning☆127Updated 3 weeks ago
- ☆19Updated 2 months ago
- [EMNLP 2024] “ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models”☆25Updated last year
- [ICLR 2025] Pad: Personalized alignment of LLMs at decoding-time☆17Updated 9 months ago
- ☆17Updated 2 months ago
- Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs.☆50Updated 2 years ago
- The reinforcement learning codes for dataset SPA-VL☆43Updated last year
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"?☆36Updated 6 months ago
- [AI4MATH@ICML2025] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs☆41Updated 7 months ago
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"☆84Updated 2 years ago
- [ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation☆74Updated last month
- Code and data for the paper: Competing Large Language Models in Multi-Agent Gaming Environments☆91Updated 3 weeks ago
- 😎 curated list of awesome LMM hallucinations papers, methods & resources.☆150Updated last year
- ☆35Updated 3 weeks ago