compling-wat / ura-practice
Practice tasks for the CompLING lab internship application.
☆7 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for ura-practice
- The MiniAgents visualization tool for simulacra. ☆13 · Updated 7 months ago
- Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award) ☆63 · Updated last month
- The official repository of the Omni-MATH benchmark. ☆52 · Updated 2 weeks ago
- ☆95 · Updated last week
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆50 · Updated 6 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆93 · Updated last week
- This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI" ☆86 · Updated last month
- 😎 up-to-date & curated list of awesome LMM hallucinations papers, methods & resources. ☆146 · Updated 7 months ago
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc. ☆175 · Updated last month
- A Survey on the Honesty of Large Language Models ☆46 · Updated last month
- LOFT: A 1 Million+ Token Long-Context Benchmark ☆146 · Updated 3 weeks ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆107 · Updated 4 months ago
- The Paper List on Data Contamination for Large Language Models Evaluation. ☆76 · Updated this week
- Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and… ☆29 · Updated last month
- ☆54 · Updated 2 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆84 · Updated 8 months ago
- MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts ☆240 · Updated 2 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆110 · Updated last month
- Data and code for the paper "NormBank: A Knowledge Bank of Situational Social Norms" ☆24 · Updated last year
- EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural langu… ☆100 · Updated 6 months ago
- Accelerating the development of large multimodal models (LMMs) with lmms-eval ☆11 · Updated last month
- [ACL 2024] Official GitHub repo for OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scie… ☆96 · Updated 4 months ago
- Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆64 · Updated last week
- Multimodal language model benchmark, featuring challenging examples ☆149 · Updated 3 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆61 · Updated 7 months ago
- Paper collections of methods that use language to interact with an environment, including interacting with the real world, a simulated world, or the WWW… ☆123 · Updated last year
- ☆22 · Updated last year
- Evaluating Mathematical Reasoning Beyond Accuracy ☆37 · Updated 7 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆119 · Updated 3 weeks ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆63 · Updated last year