WebQnA / WebQA
☆49 Updated 2 months ago
Alternatives and similar repositories for WebQA:
Users who are interested in WebQA are comparing it to the repositories listed below.
- ☆30 Updated 10 months ago
- ☆85 Updated 2 years ago
- ☆36 Updated 11 months ago
- ☆31 Updated last year
- ☆17 Updated last year
- Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models ☆22 Updated 7 months ago
- ☆61 Updated 2 years ago
- [EMNLP 2022] Code and data for "Controllable Dialogue Simulation with In-Context Learning" ☆34 Updated 2 years ago
- TBC ☆26 Updated 2 years ago
- Supporting code for ReCEval paper ☆28 Updated 6 months ago
- [NeurIPS 2022] Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings ☆21 Updated 2 years ago
- Methods and evaluation for aligning language models temporally ☆27 Updated last year
- ☆121 Updated 2 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆24 Updated last year
- ☆25 Updated 2 years ago
- NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings ☆55 Updated 9 months ago
- Implementation of "Visualize Before You Write: Imagination-Guided Open-Ended Text Generation". ☆17 Updated 2 years ago
- ☆14 Updated last year
- Code and data for paper "Context-faithful Prompting for Large Language Models". ☆39 Updated last year
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 Updated 2 years ago
- Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆58 Updated last year
- code for the table-based open domain question answering project, with paper title: "Reasoning over Hybrid Chain for Table-and-Text Open D… ☆12 Updated 2 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration ☆19 Updated 2 years ago
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective ☆30 Updated last year
- Resources for Retrieval Augmentation for Commonsense Reasoning: A Unified Approach. EMNLP 2022. ☆21 Updated 2 years ago
- Visual and Embodied Concepts evaluation benchmark ☆21 Updated last year
- ☆48 Updated 11 months ago
- ReCross: Unsupervised Cross-Task Generalization via Retrieval Augmentation ☆24 Updated 2 years ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆54 Updated 8 months ago
- The official repository for the paper "From Zero to Hero: Examining the Power of Symbolic Tasks in Instruction Tuning". ☆64 Updated last year