allenai / SciRIFF
Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.
☆32 Updated 2 months ago
Alternatives and similar repositories for SciRIFF:
Users who are interested in SciRIFF are comparing it to the repositories listed below.
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆69 Updated 2 months ago
- ☆31 Updated last year
- Embedding Recycling for Language models ☆38 Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆46 Updated last year
- Few-shot Learning with Auxiliary Data ☆26 Updated last year
- ☆38 Updated 10 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆46 Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation ☆28 Updated 2 weeks ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award … ☆38 Updated 3 months ago
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆34 Updated 2 months ago
- ☆44 Updated 3 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆42 Updated 7 months ago
- ☆19 Updated 4 months ago
- Tasks for describing differences between text distributions. ☆16 Updated 6 months ago
- ☆17 Updated 4 months ago
- ☆26 Updated 7 months ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆40 Updated 3 months ago
- ☆66 Updated last year
- ☆39 Updated 2 years ago
- ☆48 Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆17 Updated 2 weeks ago
- ☆40 Updated last week
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration" ☆31 Updated 6 months ago
- ☆72 Updated 9 months ago
- Aioli: A unified optimization framework for language model data mixing ☆20 Updated last month
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024 ☆28 Updated 2 months ago
- PyTorch building blocks for the OLMo ecosystem ☆54 Updated this week
- Code/data for MARG (multi-agent review generation) ☆38 Updated 3 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆42 Updated last year
- ☆34 Updated 10 months ago