sail-sg / sailcraft
🚢 Data Toolkit for Sailor Language Models
☆94 Updated 7 months ago
Alternatives and similar repositories for sailcraft
Users who are interested in sailcraft are comparing it to the libraries listed below.
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆148 Updated 11 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆143 Updated 10 months ago
- ☆155 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated last year
- ☆127 Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆80 Updated last year
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆98 Updated 10 months ago
- [ACL 2025] AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆156 Updated 2 months ago
- Reformatted Alignment ☆113 Updated last year
- BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent ☆84 Updated this week
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆47 Updated last year
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆153 Updated last year
- Benchmarking LLMs with Challenging Tasks from Real Users ☆241 Updated 11 months ago
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆56 Updated 2 months ago
- ☆68 Updated 2 years ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆133 Updated last year
- Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆55 Updated last year
- Official repository for paper "ReasonIR Training Retrievers for Reasoning Tasks". ☆202 Updated 3 months ago
- This is the official repository for Inheritune. ☆113 Updated 7 months ago
- ☆62 Updated last year
- Complex Function Calling Benchmark. ☆135 Updated 8 months ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: … ☆86 Updated 3 weeks ago
- ☆74 Updated last year
- ☆48 Updated last year
- This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods. ☆56 Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆137 Updated 8 months ago
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning" ☆204 Updated 9 months ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning ☆94 Updated 2 years ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆206 Updated last year
- Finetune mistral-7b-instruct for sentence embeddings ☆87 Updated last year