A general framework for evaluating the performance of large language models (LLMs), based on a peer-review mechanism among LLMs
☆19Aug 3, 2024Updated last year
Alternatives and similar repositories for PRE
Users that are interested in PRE are comparing it to the libraries listed below.
- An evaluation framework to test AI in a trial-and-error process. It is a simplified Natural Selection test.☆22Mar 11, 2025Updated last year
- The homepage for ConvSearch Dataset.☆14May 31, 2022Updated 4 years ago
- LLM with LuXun (鲁迅) style☆91May 15, 2023Updated 3 years ago
- ☆15Jul 25, 2025Updated 11 months ago
- Code for AAAI 2024 paper Wikiformer☆20Dec 21, 2023Updated 2 years ago
- A Java implementation of word2vec☆10Apr 24, 2016Updated 10 years ago
- ☆32Jul 4, 2022Updated 4 years ago
- Code for I3 Retriever, accepted by CIKM'23.☆53Oct 22, 2023Updated 2 years ago
- CIKM 2022: Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models☆10Aug 4, 2022Updated 3 years ago
- ☆13Nov 9, 2021Updated 4 years ago
- ☆13May 11, 2021Updated 5 years ago
- The official repo for our SIGIR'23 Full paper: Constructing Tree-based Index for Efficient and Effective Dense Retrieval☆28Jun 7, 2023Updated 3 years ago
- Code for KERM: Incorporating Explicit Knowledge in Pre-trained Language Models for Passage Re-ranking, accepted at SIGIR 2022.☆19Oct 31, 2022Updated 3 years ago
- The official repo for our paper: LegalAgentBench: Evaluating LLM Agents in Legal Domain☆48Apr 10, 2026Updated 2 months ago
- StaRD: Statute Retrieval Dataset based on Real-World Legal Consultation☆24Apr 24, 2025Updated last year
- Code to reproduce THUIR's submissions for COLIEE 2023 Task1 and Task2☆28May 12, 2023Updated 3 years ago
- Official code space for "SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development"☆60Oct 24, 2025Updated 8 months ago
- Large Visual Language Model (LVLM), Large Language Model (LLM), Multimodal Large Language Model (MLLM), Alignment, Agent, AI System, Survey☆21Jul 27, 2025Updated 11 months ago
- WSDM'22 Best Paper: Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval☆119Aug 7, 2024Updated last year
- ☆17Jul 18, 2022Updated 3 years ago
- ☆13Oct 28, 2024Updated last year
- The implementation of deep learning models for EEG classification.☆15Apr 21, 2023Updated 3 years ago
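- Event extraction with BERT: the [CLS] token is used for event classification and the last-layer vectors for sequence labeling, with both tasks trained jointly.☆12Jun 7, 2021Updated 5 years ago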
- Simple and efficient multi-GPU fine-tuning of large models with deepspeed + trainer☆133May 27, 2023Updated 3 years ago
- An easy-to-use Python toolkit for flexibly adapting various neural ranking models to a target domain.☆60May 17, 2023Updated 3 years ago
- SimKO: Simple Pass@K Policy Optimization☆31Oct 24, 2025Updated 8 months ago
- Test-time compute in information retrieval☆59Jul 8, 2025Updated 11 months ago
- Personalized product search with product reviews☆17Feb 1, 2023Updated 3 years ago
- Network timing evaluation used to detect beacons; works with argus flow as the source☆20May 4, 2016Updated 10 years ago
- ☆47Apr 9, 2025Updated last year
- ☆16May 8, 2021Updated 5 years ago
- Hybrid List Aware Transformer Reranking☆20Oct 25, 2022Updated 3 years ago
- Fast whitespace correction with Transformers☆18Aug 22, 2025Updated 10 months ago
- ☆12Jul 4, 2022Updated 4 years ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024)☆15Feb 1, 2025Updated last year
- Official codebase for NeurIPS 2022 paper End-to-end Learning to Index and Search in Large Output Spaces☆12Apr 19, 2023Updated 3 years ago
- Truly Conversational Search is the next logical step in the journey to generate intelligent and useful AI. To understand what this may mean…☆115Jun 12, 2023Updated 3 years ago
- [WSDM 2024 Best Paper Honorable Mention] Debiasing Sequential Recommenders through Distributionally Robust Optimization over System Expos…☆16Jun 20, 2024Updated 2 years ago
- ☆16May 22, 2022Updated 4 years ago