dair-iitd / jeebench
JEEBench, EMNLP 2023
☆43 Updated last year
Alternatives and similar repositories for jeebench
Users who are interested in jeebench are comparing it to the repositories listed below
- ☆23 Updated 3 weeks ago
- Resources for cultural NLP research ☆106 Updated last month
- ☆36 Updated 2 years ago
- Discovering Data-driven Hypotheses in the Wild ☆117 Updated 5 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ☆42 Updated last year
- ☆69 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- ☆29 Updated last year
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty ☆87 Updated last year
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆23 Updated last year
- ☆164 Updated 11 months ago
- Official implementation of "BERTs are Generative In-Context Learners" ☆32 Updated 8 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆56 Updated last year
- Minimum Description Length probing for neural network representations ☆20 Updated 9 months ago
- ☆129 Updated last year
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆31 Updated 6 months ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆86 Updated last year
- ☆78 Updated last year
- Embedding Recycling for Language models ☆38 Updated 2 years ago
- ☆44 Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆75 Updated last year
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data ☆43 Updated 9 months ago
- ☆21 Updated 5 months ago
- ☆10 Updated last year
- ☆24 Updated 7 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆71 Updated 2 years ago
- ☆111 Updated 9 months ago
- ☆29 Updated 3 weeks ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆49 Updated 2 years ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆71 Updated last year