open-compass / CompassJudger
☆87Updated last week
Alternatives and similar repositories for CompassJudger:
Users that are interested in CompassJudger are comparing it to the libraries listed below
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆221Updated 2 weeks ago
- ☆260Updated 7 months ago
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement☆176Updated 11 months ago
- Reformatted Alignment☆114Updated 5 months ago
- The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"☆142Updated last month
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step☆261Updated 11 months ago
- The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"☆53Updated 10 months ago
- The demo, code and data of FollowRAG☆70Updated 2 months ago
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning☆211Updated last month
- ☆91Updated 2 months ago
- Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning.☆60Updated 11 months ago
- [NeurIPS 2024] Agent Planning with World Knowledge Model☆114Updated 2 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆111Updated 3 months ago
- ☆99Updated 2 months ago
- ☆307Updated 5 months ago
- ☆99Updated last month
- This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for E…☆397Updated last month
- LongQLoRA: Extend Context Length of LLMs Efficiently☆163Updated last year
- Evaluating LLMs' multi-round chatting capability via assessing conversations generated by two LLM instances.☆145Updated last year
- Generative Judge for Evaluating Alignment☆229Updated last year
- [CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback☆267Updated 5 months ago
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆340Updated 5 months ago
- ☆120Updated 8 months ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆242Updated last year
- A series of technical reports on Slow Thinking with LLM☆438Updated this week
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆91Updated this week
- Fantastic Data Engineering for Large Language Models☆75Updated 2 months ago